GENERATING AND PROVIDING INFORMATION ABOUT
EXPECTED FUTURE PRICES OF ASSETS 



BACKGROUND 

This invention relates to generating and providing information about
expected future prices of assets.

Among the kinds of information available at web sites on the Internet are
current and historical prices and volumes of stock transactions, prices of
put or call options at specific strike prices and expiration dates for various
stocks, and theoretical prices of put and call options that are derived using
formulas such as the Black-Scholes formula. Some web sites give
predictions by individual experts of the future prices or price ranges of
specific stocks.

A call option gives the holder a right to buy an underlying marketable
asset by an expiration date for a specified strike price. A put option gives
an analogous right to sell an asset. Options are called derivative securities
because they derive their values from the prices of the underlying assets.
Examples of underlying assets are corporate stock, commodity stock, and
currency. The price of an option is sometimes called the premium.

People who buy and sell options are naturally interested in what
appropriate prices might be for the options. One well-known formula for
determining the prices for call and put options under idealized conditions
is called the Black-Scholes formula. Black-Scholes provides an estimate
of call or put prices for options having a defined expiration date, given a
current price of the underlying asset, an interest rate, and the volatility rate
(sometimes simply called volatility) of the asset. Black-Scholes assumes
constant interest rates and volatility, no arbitrage, and trading that is
continuous over a specified price range.

SUMMARY 

In general, in one aspect, the invention features a method in which data is
received that represents current prices of options on a given asset. An
estimate is derived from the data of a corresponding implied probability
distribution of the price of the asset at a future time. Information about the
probability distribution is made available within a time frame that is useful
to investors, for example, promptly after the current option price
information becomes available.

Implementations of the invention may include one or more of the
following features. The data may represent a finite number of prices of
options at spaced-apart strike prices of the asset. A set of first differences
may be calculated of the finite number of prices to form an estimate of the
cumulative probability distribution of the price of the asset at a future
time. A set of second differences may be calculated of the finite number of
strike prices from the set of first differences to form the estimate of the
probability distribution function of the price of the asset at a future time.

In general, in another aspect, the invention features a method in which a
real time data feed is provided that contains information based on the
probability distribution.

In general, in another aspect, the invention features a method that includes
providing a graphical user interface for viewing pages containing financial
information related to an asset; and when a user indicates an asset of
interest, displaying probability information related to the price of the asset
at a future time.

In general, in another aspect, the invention features a method that includes
receiving data representing current prices of options on a given asset, the
options being associated with spaced-apart strike prices of the asset at a
future time. The data includes shifted current prices of options resulting
from a shifted underlying price of the asset, the amount by which the asset
price has shifted being different from the amount by which the strike
prices are spaced apart. An estimate is derived from a quantized implied
probability distribution of the price of the asset at a future time, the
elements of the quantized probability distribution being more finely
spaced than for a probability distribution derived without the shifted
current price data.

In general, in another aspect, the invention includes deriving from said
data an estimate of an implied probability distribution of the price of the
asset at a future time, the mathematical derivation including a smoothing
operation.

Implementations of the invention may include one or more of the 
following features. The smoothing operation may be performed in a 
volatility domain. 

In general, in another aspect, the invention includes deriving a volatility
for each of the future dates in accordance with a predetermined option
pricing formula that links option prices with strike prices of the asset; and
generating a smoothed and extrapolated volatility function.

Implementations of the invention may include one or more of the
following features. The volatility function may be extrapolated to a wider
range of dates than the future dates and to other strike prices. The
smoothed volatility function may be applicable to conditions in which the
data is reliable under a predetermined measure of reliability. The implied
volatility function formula may have a quadratic form with two variables
representing a strike price and an expiration date. The coefficients of the
implied volatility function formula may be determined by applying
regression analysis to approximately fit the implied volatility function
formula to each of the implied volatilities.

In general, in another aspect, the invention features a method that includes
receiving data representing current prices of options on assets belonging to
a portfolio, deriving from the data an estimate of an implied multivariate
distribution of the price of a quantity at a future time that depends on the
assets belonging to the portfolio, and making information about the
probability distribution available within a time frame that is useful to
investors.

In general, in another aspect, the invention features a method that includes
receiving data representing values of a set of factors that influence a
composite value, deriving from the data an estimate of an implied
multivariate distribution of the price of a quantity at a future time that
depends on assets belonging to a portfolio, and making information about
the probability distribution available within a time frame that is useful to
investors.

Implementations of the invention may include one or more of the
following features. The mathematical derivation may include generating a
multivariate probability distribution function based on a correlation among
the factors.

In general, in another aspect, the invention features a graphical user
interface that includes a user interface element adapted to enable a user to
indicate a future time, a user interface element adapted to show a current
price of an asset, and a user interface element adapted to show the
probability distribution of the price of the asset at the future time.

In general, in one aspect, the invention features a method that includes
continually generating current data that contains probability distributions
of prices of assets at future times, continually feeding the current data to a
recipient electronically, and the recipient using the fed data for services
provided to users.

In general, in another aspect, the invention features a method that includes
receiving data representing current prices of options on assets belonging to
a portfolio, receiving data representing current prices of market
transactions associated with a second portfolio of assets, and providing
information electronically on the probability that the second portfolio of
assets will reach a first value given the condition that the first portfolio of
assets reaches a specified price at a future time.

In general, in another aspect, the invention features a method that includes
receiving data representative of actual market transactions associated with
a first portfolio of assets; receiving data representative of actual market
transactions associated with a second portfolio of assets; and providing,
through a network, information on the expectation value of the price of the
first portfolio of assets given the condition that the second portfolio of
assets reaches a first specified price at a specified future time.

In general, in another aspect, the invention features a method that includes
evaluating an event defined by a first multivariate expression that
represents a combination of macroeconomic variables at a time T, and
estimating (e.g., using Monte Carlo techniques) the probability that a
second multivariate expression that represents a combination of values of
assets of a portfolio will have a value greater than a constant B at time T if
the value of the first multivariate expression is greater than a constant A.
The market variables represented by the first multivariate expression can
include macroeconomic factors (such as interest rates), market preferences
regarding the style of company fundamentals (large/small companies,
rapid/steady growth, etc.), or market preferences for industry sectors.

In general, in another aspect, the invention features a method that includes
defining a regression expression that relates the value of one variable
representing a combination of macroeconomic variables at time T to a
second variable at time T that represents a combination of assets of a
portfolio, and estimating the probability that the second variable will have
a value greater than a constant B at time T if the value of the first variable
is greater than a constant A at time T, based on the ratio of the probability
of x being greater than A under the regression expression and the
probability of x being greater than A.

In general, in another aspect, the invention features a method that includes
defining a current value of an option as a quadratic expression that
depends on the difference between the current price of the option and the
current price of the underlying security, and using Monte Carlo techniques
to estimate a probability distribution of the value at a future time T of a
portfolio that includes the option.

The invention takes advantage of the realization that option prices for a
given underlying asset are indicative of the market's prediction of the
risk-neutral price of the underlying asset in the future (e.g., at the
expiration of the option). Option price data may be used to derive the
market's prediction in the form of an implied probability distribution of
future risk-neutral prices. Additional explanation of the significance of the
phrase risk-neutral is contained in the Appendix.

The implied probability distribution and other information related to it
may be made easily available to people for whom the information may be
useful, such as those considering an investment in the underlying asset, or
a brokerage firm advising such an investor.

Among the advantages of the invention are one or more of the following:
Investors and prospective investors in an underlying asset, such as a
publicly-traded stock, are given access to a key additional piece of current
information, namely calculated data representing the market's view of the
future price of the stock. Brokerage firms, investment advisors, and other
companies involved in the securities markets are able to provide the
information or related services to their clients and customers.

Other features and advantages will become apparent from the following 
description and from the claims. 

DESCRIPTION 

Details of implementations of the invention are set forth in the figures and
the related description below.

Figures 1, 2, and 3 are graphs. 

Figure 4 is a block diagram. 

Figures 5, 6, and 7 are web pages. 

Figures 8 and 9 illustrate user interfaces.

Figure 10 shows data structures. 

In general, the price of a call or put option is determined by buyers and
sellers in the option market and carries information about the market's
prediction of the expected price of the underlying asset at the expiration
date. (The information does not include the premium that investors require
for bearing risk, which must be estimated separately. The average
long-term value of the risk premium is about 6% per year for all stocks and
may be adjusted for an individual stock's historical responsiveness to
broader market movements.)

The information carried in the prices of options having various strike
prices and expirations is used to derive probability distributions of the
asset's price at future times and to display corresponding information to
investors, for example, on the World Wide Web.

Basic method 

We first define some relevant terms. We define x as the strike price, c(x)
as the theoretical call price function (the price of the call as a function of
strike price), p(x) as the theoretical put price function, F(x) as the
cumulative distribution function (cdf) of the price of the underlying asset
at expiration, and f(x) as the probability density function (pdf) of the asset
price at expiration. By definition, f(x) = F'(x) (i.e., the probability density
function is the derivative of the cumulative distribution function).

The relationship between c(x), p(x), f(x), and F(x) can be succinctly stated
as:

    F(x) = c'(x) + 1 = p'(x);    (1a)
    f(x) = c''(x) = p''(x).      (1b)

In words, the pdf is the second derivative of either the call price function
or of the put price function. A simple proof of these relationships is given in
the Appendix. The Appendix also contains other detailed information
relating to the features of the invention.

This so-called "second-derivative method" for computing implied
probability distributions from option price data is known in the academic
literature, but apparently not very well known. For example, the standard
textbook "Options, Futures, and Other Derivatives," by John C. Hull
(Fourth Edition, 1999; Prentice-Hall) mentions implied probabilities, but
not the second-derivative method. Perhaps the best reference that we have
been able to find is J. C. Jackwerth and M. Rubinstein, "Recovering
probability distributions from option prices," J. Finance, vol. 51, pp.
1611-1631 (1996), which has only six prior references. This paper cites D. T.
Breeden and R. H. Litzenberger, "Prices of state-contingent claims
implicit in option prices," J. Business, vol. 51, pp. 631-650 (1978) as the
originator of a second-derivative method, although the latter paper
nowhere mentions probabilities.

Approximating f(x) from finite bid and ask option prices

Equations (1a) and (1b) are obtained by assuming that the variable x is
continuous and ranges from 0 to infinity. In practice, options are usually
traded within certain price ranges and only for certain price intervals (e.g.,
ranging from $110 to $180 at $5 intervals). Thus, the call and/or put
option prices are known only for a finite subset of strike prices. Under
such circumstances, estimates of Equations (1a) and (1b) can be computed
by taking differences instead of derivatives as follows.

We assume that the option prices c(x) and p(x) are quoted for a finite
subset of equally-spaced strike prices x = n\Delta, where n is an integer, and
\Delta is the spacing between quoted prices. Define c_n = c(n\Delta),
p_n = p(n\Delta). Then the first derivatives c'(x) and p'(x) at
x = (n + 1/2)\Delta may be estimated by the first differences:

    c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta};    (7a)

    p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}.    (7b)

The corresponding estimates of the cumulative distribution function
\hat{F}_{n+1/2} = \hat{F}((n + 1/2)\Delta) are

    \hat{F}_{n+1/2} = 1 + c'_{n+1/2};    (8a)

    \hat{F}_{n+1/2} = p'_{n+1/2}.        (8b)

The second derivatives c''(x) and p''(x) at x = n\Delta may likewise be
estimated by the second differences, i.e., differences of the estimates of
the first derivatives:

    c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2};    (9a)

    p''_n = \frac{p'_{n+1/2} - p'_{n-1/2}}{\Delta} = \frac{p_{n+1} - 2p_n + p_{n-1}}{\Delta^2}.    (9b)

Either of these estimates of the second derivatives may be used as an
estimate of the probability density values at x = n\Delta, i.e., f(n\Delta):

    \hat{f}_n = p''_n  or  \hat{f}_n = c''_n.    (10)

Moreover, the market prices of call and put options are usually given in
terms of a bid-ask spread, and thus either the bid price or the ask price (or
some intermediate value) may be used as the call or put option price. By
using the bid and ask prices for both the call option and the put option,
four estimates of F(x) and f(x) may be obtained. These estimates may be
combined according to their reliability in any desired way. For example,
one might use the estimate derived from the put bid price curve for values
of x less than the current price s of the underlying asset, and the estimate
derived from the call bid price curve for values of x greater than s.
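The difference estimates described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names implied_cdf_pdf and combine_cdf are hypothetical, and the sample prices are synthetic (an asset price uniformly distributed between $100 and $200, with no discounting), not market data.

```python
import numpy as np

def implied_cdf_pdf(strikes, calls, puts):
    """First and second differences of option prices on an equally spaced strike grid."""
    strikes = np.asarray(strikes, dtype=float)
    calls = np.asarray(calls, dtype=float)
    puts = np.asarray(puts, dtype=float)
    spacing = np.diff(strikes)
    delta = spacing[0]
    if not np.allclose(spacing, delta):
        raise ValueError("strike prices must be equally spaced")

    c1 = np.diff(calls) / delta   # first differences of call prices
    p1 = np.diff(puts) / delta    # first differences of put prices
    return {
        "midpoints": strikes[:-1] + delta / 2.0,   # x = (n + 1/2) * delta
        "cdf_from_calls": 1.0 + c1,
        "cdf_from_puts": p1,
        "interior_strikes": strikes[1:-1],         # x = n * delta
        "pdf_from_calls": np.diff(c1) / delta,     # second differences
        "pdf_from_puts": np.diff(p1) / delta,
    }

def combine_cdf(estimates, s):
    """Use the put-based estimate below the underlying price s, the call-based one above."""
    mid = estimates["midpoints"]
    return np.where(mid < s, estimates["cdf_from_puts"], estimates["cdf_from_calls"])

# Synthetic example (an assumption for illustration): uniform price on [100, 200].
strikes = np.arange(110.0, 185.0, 5.0)
calls = (200.0 - strikes) ** 2 / 200.0
puts = (strikes - 100.0) ** 2 / 200.0
est = implied_cdf_pdf(strikes, calls, puts)
cdf = combine_cdf(est, s=150.0)
```

For this synthetic input the differences recover the uniform cdf and the constant density exactly, because the price curves are quadratic in the strike price. With real bid or ask quotes the estimates would be noisy, which is why the choice of curve on each side of s matters.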

Examples of c_n, p_n, \hat{F}_{n+1/2}, and \hat{f}_n are shown in figures 1, 2, and 3 using
the data of TABLE 1 (see below).

Tabular data 

TABLE 1 below shows sample bid prices of call and put options for strike
prices of an asset ranging from $110 to $180 at $5 intervals and the
cumulative distribution values \hat{F}_{n+1/2} and probability density values
\hat{f}_n computed according to Equations (7)-(10) above.

In the table, the values for \hat{F}_{n+1/2} correspond to strike prices that are
midway between the two strike prices used to compute \hat{F}_{n+1/2}. Thus, the
cumulative distribution value shown to the right of the strike price $110
actually corresponds to the strike price $112.5, and the value to the right
of the strike price $115 actually corresponds to strike price $117.5, and so
forth.



Dynamic estimates for F(x) and f(x).

In Equations (7)-(10), the call and put option prices were assumed to be
static in the calculation of the cumulative distribution function F(x) and
probability density function f(x) for a finite subset of strike prices
x = n\Delta. In the real world, the price s of the underlying asset changes
with time, and there will be a corresponding change in option prices. As a
first order approximation, if the price s increases by a small amount \delta,
then the option price curves will effectively shift to the right by the
amount \delta. (Here, \delta may be either positive or negative. For a more
precise discussion of the shift, see the Appendix.) As a result, the price
c(x) or p(x) now quoted at strike price x may be used as an estimate for the
option price on the previous price curve at strike price x' = x - \delta. As a
result, the prices on the previous curve at a new discrete subset of strike
prices x = n\Delta - \delta become effectively visible. Given enough movements
of the underlying price, therefore, we can effectively compute estimates of
c(x), p(x), F(x), and/or f(x) for a subset of strike prices x that is much
more closely spaced than the subset available at any one time.
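As a rough illustration of this first-order shift, quotes observed after the underlying price has moved can be re-indexed onto the earlier price curve and merged with the earlier quotes. The function name merge_shifted_quotes and the sample numbers below are hypothetical; the sketch assumes the simple rule x' = x - delta and ignores the more precise treatment referred to in the Appendix.

```python
import numpy as np

def merge_shifted_quotes(strikes, snapshots):
    """Merge option quotes taken at different underlying prices onto one reference curve.

    strikes   : quoted strike prices (the same grid in every snapshot)
    snapshots : list of (shift, prices), where shift is the underlying price at
                quote time minus the reference underlying price
    Returns strike positions on the reference curve and the matching prices,
    sorted by strike.
    """
    strikes = np.asarray(strikes, dtype=float)
    xs, values = [], []
    for shift, prices in snapshots:
        xs.append(strikes - shift)                 # x' = x - delta
        values.append(np.asarray(prices, dtype=float))
    x = np.concatenate(xs)
    v = np.concatenate(values)
    order = np.argsort(x, kind="stable")
    return x[order], v[order]

# Hypothetical call quotes: one snapshot at the reference price, one after a +2 move.
grid = [110.0, 115.0, 120.0]
x_fine, c_fine = merge_shifted_quotes(
    grid, [(0.0, [12.0, 8.0, 5.0]), (2.0, [13.5, 9.4, 6.1])]
)
```

The merged grid is no longer equally spaced, so difference estimates on it would need to divide by the actual gaps between neighboring strikes rather than by a single spacing.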

Extrapolating and smoothing probability distributions. 

In a typical options market, the option prices are available only for certain 
expiration dates. In addition, the option prices are more reliable for 
options that are actively traded, which are typically nearer-term options at 
strike prices near the underlying price. It is therefore desirable to 
extrapolate and interpolate probability distributions to times other than 
actual expiration dates and to wider ranges of strike prices. 

Any standard extrapolation and smoothing techniques may be used
directly on the cumulative distribution values \hat{F}_{n+1/2} or probability density
values \hat{f}_n to give a smoothed and extrapolated estimate of F(x) or f(x).
Similarly, given such estimated curves for a discrete subset of future times
T, standard interpolation and extrapolation techniques may be used to
estimate such curves for other specified values of T, or for a continuous
range of T > 0.

A less direct but useful approach is to perform extrapolation and
smoothing on an implied volatility function, which is then used to
calculate the other functions, such as c(x), p(x), F(x), and f(x). The
volatility rate of an asset (often simply called its volatility) is a measure of
uncertainty about the returns provided by the asset. The volatility rates of
a stock may typically be in the range of 0.3 to 0.5 per year.

An advantage of performing extrapolation and smoothing on implied
volatility curves is that different types of volatility curves (so-called
"volatility smiles") are known and can be used as a guide to the
extrapolation and smoothing process to prevent "overfitting" of certain
unreliable data points.

The standard method of computing implied volatilities is to invert the
Black-Scholes pricing formula (see Appendix) for the actual call price c(x)
or put price p(x) of an underlying asset at a given strike price x, given the
underlying price s (current price of asset), risk-free rate of interest r, and
T (expiration date). When this is done for a range of values of x, an
estimate of an implied volatility curve \sigma(x) is obtained. This curve may be
smoothed and extrapolated by any standard method to give a smoothed
curve \hat{\sigma}(x). Then corresponding smoothed put and call price curves may
be computed using the Black-Scholes pricing formula and differentiated
once or twice to give a smoothed cdf or pdf. Finally, given such estimated
curves for a discrete subset of future times T, standard interpolation and
extrapolation techniques may be used to estimate such curves for other
specified values of T, or for a continuous range of T > 0.
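The inversion step can be illustrated with a short numerical sketch. The Appendix formula is not reproduced in this section, so the code below assumes the standard Black-Scholes call price for an asset paying no dividends; the function names bs_call and implied_vol, the bisection bounds, and the sample inputs are all illustrative assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_N = NormalDist().cdf  # standard normal cdf

def bs_call(s, x, r, T, sigma):
    """Standard Black-Scholes call price (assumed form; no dividends)."""
    d1 = (log(s / x) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return s * _N(d1) - x * exp(-r * T) * _N(d2)

def implied_vol(price, s, x, r, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Find sigma with bs_call(...) == price by bisection.

    Bisection works because the call price increases with sigma.
    The bracket [lo, hi] is an arbitrary illustrative choice.
    """
    if not bs_call(s, x, r, T, lo) <= price <= bs_call(s, x, r, T, hi):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s, x, r, T, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on hypothetical inputs: price a call at sigma = 0.4, then recover sigma.
quoted = bs_call(s=150.0, x=160.0, r=0.05, T=0.5, sigma=0.4)
recovered = implied_vol(quoted, s=150.0, x=160.0, r=0.05, T=0.5)
```

Repeating the inversion across the quoted strike prices would give the raw points of an implied volatility curve, which could then be smoothed by any standard method.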

Another new way to compute implied volatilities is first to compute a
finite subset of cdf values \hat{F}_{n+1/2} and then to invert the Black-Scholes cdf
formula (see Appendix) at these values. When this is done for a range of
values of x, an estimate of a generally different implied volatility curve
\sigma_1(x) is obtained, called the cdf-implied volatility curve. Again, this curve
may be smoothed and extrapolated by any standard method to give a
smoothed curve \hat{\sigma}_1(x). Then a corresponding smoothed cdf may be
computed from the Black-Scholes cdf formula, and differentiated once to
give a smoothed pdf. Finally, again, given such estimated curves for a
discrete subset of future times T, standard interpolation and extrapolation
techniques may be used to estimate such curves for other specified values
of T, or for a continuous range of T > 0.

Some advantages of using the cdf-implied volatility curve rather than the
conventional implied volatility curve are that the computations are
simpler, at least from an estimate of F(x), and that it fits better with the
multivariate techniques to be discussed below.

A particular method for finding a smoothed and extrapolated implied
volatility curve \hat{\sigma}_1(x,T) as a function of both strike price x and time T to
expiration is as follows. The volatility curve is assumed to be
approximated by a quadratic formula

    \hat{\sigma}_1(x,T) = a_0 + a_1 x + a_2 x^2 + a_3 T + a_4 T^2 + a_5 xT.    (14)

The coefficients {a_i} are determined by regression to fit the available data
regarding \sigma_1(x,T) as closely as possible. Given the smoothed curve
\hat{\sigma}_1(x,T), corresponding smoothed cdfs (for different x's and T's) may be
computed from the Black-Scholes cdf formula for each time and
differentiated once to give a smoothed pdf. An alternative procedure, with
numerical advantages, is to use a quadratic fit like the above for a function
\hat{\sigma}(x,T), and then invert the Black-Scholes cdf to find \hat{\sigma}_1(x,T). See the
Appendix for the academic history of such approximations of \hat{\sigma}(x,T).

Another useful variation is to fit \hat{\sigma}(x,T) with a quadratic function of x at
times T which are specific expiration dates, then linearly interpolate at
other times T.
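A regression of this kind can be sketched as an ordinary least-squares fit of the six coefficients of the quadratic formula. The function names and the synthetic volatility data below are hypothetical, and ordinary least squares is only one possible choice of regression; a weighted fit that discounts unreliable quotes would follow the same pattern.

```python
import numpy as np

def design_matrix(x, T):
    """Columns 1, x, x^2, T, T^2, x*T of the quadratic volatility formula."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    T = np.atleast_1d(np.asarray(T, dtype=float))
    return np.column_stack([np.ones_like(x), x, x ** 2, T, T ** 2, x * T])

def fit_vol_surface(x, T, sigma):
    """Least-squares estimate of the coefficients a_0 ... a_5."""
    coeffs, *_ = np.linalg.lstsq(
        design_matrix(x, T), np.asarray(sigma, dtype=float), rcond=None
    )
    return coeffs

def eval_vol_surface(coeffs, x, T):
    """Smoothed volatility at arbitrary (x, T), including extrapolated points."""
    return design_matrix(x, T) @ coeffs

# Synthetic data (an assumption for illustration): strikes 110..180, three expirations.
xs, Ts = np.meshgrid(np.arange(110.0, 190.0, 10.0), [0.25, 0.5, 1.0])
xs, Ts = xs.ravel(), Ts.ravel()
true_coeffs = np.array([0.9, -0.008, 3e-5, 0.05, -0.01, 1e-4])
vols = design_matrix(xs, Ts) @ true_coeffs

fitted = fit_vol_surface(xs, Ts, vols)
smoothed = eval_vol_surface(fitted, xs, Ts)
```

Because the fitted surface is defined for every (x, T), it can be evaluated between or beyond the quoted strike prices and expiration dates, subject to the usual caution about extrapolating a quadratic far from the data.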

Treatment of multiple assets 

The techniques described so far give probability distributions for the
future values of a single asset based on option price data for that asset.
However, in many cases an investor may be concerned with multiple
assets, for example all of the stocks in his or her portfolio, or in a mutual
fund, or in a certain index. Moreover, the investor may be concerned with
the relations between one group of assets and another.

A general method for dealing with such questions is to generate
multivariate probability distributions for all assets of interest. A
multivariate cdf may be written as F(x_1, x_2, ..., x_n), where the variables
(x_1, x_2, ..., x_n) are the values of the n assets of interest.

We will assume that we know from the techniques described above or
otherwise the marginal cdfs F_i(x_i) for each of the individual variables. As
a first step, we may define for each x_i a function y_i(x_i), called a "warping
function," such that y_i(x_i) is a standard normal (Gaussian) variable with
mean 0 and variance 1. This is simply done by defining y_i(x_i) such that
F_i(x_i) = N(y_i(x_i)) for all values of x_i, where N(x) denotes the cdf of a
standard normal variable. The function y_i(x_i) may be simply described in
terms of \sigma_1(x_i). See the Appendix. Under mild technical conditions such
as having a marginal cdf that varies monotonically, such a warping
function y_i(x_i) has a well-defined inverse warping function x_i(y_i).
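A warping function and its inverse can be sketched numerically when the marginal cdf is known only at a few strike prices. The sketch below assumes linear interpolation between tabulated cdf values, which is one illustrative choice and not the Appendix construction in terms of volatility; the names warp and inverse_warp and the sample cdf table are hypothetical.

```python
from statistics import NormalDist
import numpy as np

_STD_NORMAL = NormalDist()  # mean 0, variance 1

def warp(x, grid, cdf_values, eps=1e-12):
    """y(x) = N^{-1}(F(x)), with F interpolated linearly on the tabulated grid."""
    F = float(np.interp(x, grid, cdf_values))
    F = min(max(F, eps), 1.0 - eps)   # inv_cdf requires 0 < F < 1
    return _STD_NORMAL.inv_cdf(F)

def inverse_warp(y, grid, cdf_values):
    """x(y) = F^{-1}(N(y)); requires cdf_values to be strictly increasing."""
    return float(np.interp(_STD_NORMAL.cdf(y), cdf_values, grid))

# Hypothetical marginal cdf tabulated at three prices.
grid = [100.0, 150.0, 200.0]
cdf_values = [0.1, 0.5, 0.9]

y_mid = warp(150.0, grid, cdf_values)   # median price maps to y = 0
x_back = inverse_warp(warp(130.0, grid, cdf_values), grid, cdf_values)
```

The round trip returns the starting price only because the tabulated cdf increases strictly, which is the monotonicity condition mentioned above.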

Second, we assume that we can find the historical pairwise correlations
between the warped standard normal variables y_i(x_i). These correlations
may be computed by standard techniques from any available set of
historical asset price data. We denote by C the n x n correlation matrix
whose entries are these historically-based correlations. Because each of
the variables y_i(x_i) is standard normal, the diagonal terms of C are all equal
to 1.

Now let F_C(y_1, ..., y_n) denote the cdf of a multivariate Gaussian random
n-tuple with zero mean and covariance matrix C. Define

    F(x_1, x_2, ..., x_n) = F_C(y_1(x_1), y_2(x_2), ..., y_n(x_n)).

Then F(x_1, x_2, ..., x_n) is a multivariate cdf that (a) has the correct (given)
marginal cdfs F_i(x_i); and (b) has the correct (historical) correlations
between the warped standard normal variables y_i(x_i). We use this cdf to
answer questions involving the variables (x_1, x_2, ..., x_n).

For example, the investor might have a portfolio consisting of a given
quantity of each of these assets. The value of such a portfolio is the sum

    x = h_1 x_1 + h_2 x_2 + ... + h_n x_n,    (15)

where h_i represents the quantity of the ith asset in the portfolio. The
investor might be interested in an estimate of the probability distribution
of the value x of the whole portfolio.

Such an estimate may be obtained by Monte Carlo simulation. For such a
simulation, a large number N of samples from the multivariate Gaussian
cdf F_C(y_1, ..., y_n) may be generated. Each sample (y_1, ..., y_n) may be
converted to a sample (x_1, x_2, ..., x_n) by using the inverse warping
functions x_i(y_i). The value x of the total portfolio may then be computed
for each sample. From these N values of x, the probability distribution of
x (e.g., its cdf F(x)) may be estimated.
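The simulation can be sketched as follows. The marginal cdfs are assumed to be tabulated on price grids and inverted by linear interpolation, and the correlation matrix, holdings, and sample size are made-up values for illustration; the function names sample_assets and portfolio_cdf are hypothetical.

```python
from statistics import NormalDist
import numpy as np

def sample_assets(corr, grids, cdfs, n_samples, seed=0):
    """Draw correlated asset prices from given marginals and a Gaussian correlation matrix.

    corr  : n x n correlation matrix of the warped (standard normal) variables
    grids : per-asset price grids
    cdfs  : per-asset cdf values on those grids (strictly increasing)
    """
    rng = np.random.default_rng(seed)
    n = len(grids)
    y = rng.multivariate_normal(np.zeros(n), corr, size=n_samples)
    u = np.vectorize(NormalDist().cdf)(y)   # N(y), uniform marginals
    # Inverse warping: x_i = F_i^{-1}(N(y_i)) by linear interpolation.
    return np.column_stack(
        [np.interp(u[:, i], cdfs[i], grids[i]) for i in range(n)]
    )

def portfolio_cdf(samples, holdings, level):
    """Estimate P(h_1 x_1 + ... + h_n x_n <= level) from the samples."""
    values = samples @ np.asarray(holdings, dtype=float)
    return float(np.mean(values <= level))

# Two hypothetical assets with tabulated marginal cdfs and correlation 0.5.
grids = [np.array([100.0, 150.0, 200.0]), np.array([20.0, 30.0, 40.0])]
cdfs = [np.array([0.0, 0.5, 1.0]), np.array([0.0, 0.5, 1.0])]
corr = np.array([[1.0, 0.5], [0.5, 1.0]])

samples = sample_assets(corr, grids, cdfs, n_samples=5000)
p_below = portfolio_cdf(samples, holdings=[1.0, 2.0], level=210.0)
```

Evaluating portfolio_cdf over a range of levels traces out an estimate of the portfolio cdf; the same stored samples can be reused with different holdings.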

In practice, it is useful to save the multivariate samples in a large
database. Then the cdf of any quantity whose value is a function of the
variables (x_1, x_2, ..., x_n) may be estimated from this database. For
example, if the investor would like to know the cdf of some alternative
portfolio with different quantities of each asset, this can be quickly
determined from the stored database.

An investor may also determine the effect of one portfolio (or event(s) or
variables such as interest rates, P/E ratios, public interest in a certain
sector of the market) on another portfolio as follows. Assume that the first
portfolio is represented by x, where

    x = h_1 x_1 + h_2 x_2 + ... + h_n x_n,    (30)

where each x_i may be viewed as the price of a portfolio component, and
the second portfolio is represented by y, where

    y = g_1 y_1 + g_2 y_2 + ... + g_n y_n,    (31)

where each y_i may be viewed as the price of a portfolio component or
more broadly as any macro-economic variable (macroeconomic,
fundamental, or sector related).

Consider the "what-if" question: letting A and B be given positive
constants, if x > A at time T, what is the probability that y > B at time T?
This question can be answered by creating a Monte Carlo database as
above for the multivariate cdf F(x_1, x_2, ..., x_n) corresponding to time T,
identifying those samples for which x > A, and then using only these
samples to estimate the probability that y > B. More generally, any
conditional cdf of the form F(x | E) can be estimated similarly, where x is
any function of the variables (x_1, x_2, ..., x_n) and E is any event defined in
terms of the variables (x_1, x_2, ..., x_n).
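The conditional estimate amounts to filtering the stored samples by the event and counting. The sketch below assumes, for illustration only, that the database holds one column per variable and that both x and y are weighted sums over those columns (with zero weights where a variable does not enter); the function name conditional_probability and the tiny sample table are hypothetical.

```python
import numpy as np

def conditional_probability(samples, h, g, A, B):
    """Estimate P(y > B | x > A) from Monte Carlo samples.

    samples : array with one row per sample and one column per variable
    h, g    : weights defining x = samples @ h and y = samples @ g
    Returns NaN when no sample satisfies x > A.
    """
    samples = np.asarray(samples, dtype=float)
    x = samples @ np.asarray(h, dtype=float)
    y = samples @ np.asarray(g, dtype=float)
    event = x > A                      # keep only samples where the condition holds
    if not event.any():
        return float("nan")
    return float(np.mean(y[event] > B))

# Tiny hypothetical database: column 0 drives x, column 1 drives y.
database = np.array([[1.0, 1.0], [2.0, 5.0], [3.0, 1.0], [4.0, 6.0]])
p = conditional_probability(database, h=[1.0, 0.0], g=[0.0, 1.0], A=1.5, B=2.0)
```

The estimate becomes unreliable when few samples satisfy the condition, so a rare event E would call for a larger database.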

Similarly, suppose an investor would like to know whether it is reasonable
to believe that a certain stock or portfolio x will have a value greater than a
given constant A at time T. This kind of question can be addressed by
estimating the conditional cdf of some other related and perhaps
better-understood variable (or combination of variables) y at time T, given
that x > A. If the resulting distribution for y does not look reasonable, then
the investor may conclude that it is unreasonable to expect that x > A.

Applications that use the probability distribution information 

A wide variety of techniques may be used to accumulate and process the
information needed for the calculations described above and to provide the
information to users directly or indirectly through third parties. Some of
these techniques are described below.

As shown in figure 4, the probability distribution information can be
provided to users from a host server 102 connected to a communication
network 104, for example, a public network such as the Internet or a
private network such as a corporate intranet or local area network (LAN).
For purpose of illustration, the following discussion assumes that network
104 is the Internet.

The host server 102 includes a software suite 116, a financial database
120, and a communications module 122. The communications module 122
transmits and receives data generated by the host server 102 according to
the communication protocols of the network 104.

Also connected to the network are one or more of each of the following
(only one is shown in each case): an individual or institutional user 108, an
advertisement provider 110, a financial institution 112, a third party web
server 114, a media operator 122, and a financial information provider
106.

The operator of the host server could be, for example, a financial
information source, a private company, a vendor of investment services, or
a consortium of companies that provides a centralized database of
information.

The host server 102 runs typical operating system and web server
programs that are part of the software suite 116. The web server programs
allow the host server 102 to operate as a web server and generate web
pages or elements of web pages, e.g., in HTML or XML code, that allow
each user 108 to receive and interact with probability distribution
information generated by the host server.

Software suite 116 also includes analytical software 118 that is configured
to analyze data stored in the financial database 120 to generate, for
example, the implied probability distribution of future prices of assets and
portfolios.

The financial database 120 stores financial information collected from the
financial information providers 106 and computation results generated by
the analytical software 118. The financial information providers 106 are
connected to the network 104 via a communication link 126, or the
financial information providers may feed the information directly to the
host server through a dialup or dedicated line (not shown).

Figure 4 gives a functional view of an implementation of the invention.
Structurally, the host server could be implemented as one or more web
servers coupled to the network, one or more applications servers running
the analytical software and other applications required for the system, and
one or more database servers that would store the financial database and
other information required for the system.

Figure 10 shows an example of a data feed 150 sent from the financial
information provider 106 to the host server 102 through the
communication link 126. Information is communicated to the host server
in the form of messages 151, 152. Each message contains a stream of one
or more records 153, each of which carries information about option prices
for an underlying asset. Each message includes header information 154
that identifies the sender and receiver, the current date 155, and an end of
message indicator 158, which follows the records contained in the
message.

Each record 153 in the stream includes an identifier 156 (e.g., the trading
symbol) of an underlying asset, an indication 158 of whether the record
pertains to a put or call, the strike date 160 of the put or call, the strike
price 162 of the put or call, current bid-ask prices 164 of the underlying
asset, bid-ask prices 166 for the option, and transaction volumes 168
associated with the option. The financial information provider 106 may be
an information broker, such as Reuters, Bridge, or Bloomberg, or any
other party that has access to or can generate the information carried in the
messages. The broker may provide information from sources that include,
for example, the New York Stock Exchange and the Chicago Board
Options Exchange.
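As an illustration only, the record and message layout described above could be modeled as follows. This is a minimal sketch: the class names, field names, and the helper `call_curve` are hypothetical, and taking the bid-ask midpoint as the option price is an assumption of the sketch rather than something the description specifies.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class OptionRecord:
    """One record 153 of the feed; field names are hypothetical."""
    symbol: str          # identifier 156 of the underlying asset
    kind: str            # "put" or "call"
    strike_date: date    # strike date 160
    strike_price: float  # strike price 162
    asset_bid: float     # bid-ask prices 164 of the underlying asset
    asset_ask: float
    option_bid: float    # bid-ask prices 166 for the option
    option_ask: float
    volume: int          # transaction volumes 168


@dataclass
class FeedMessage:
    """One message: header information, current date, and records."""
    sender: str
    receiver: str
    current_date: date
    records: List[OptionRecord]


def call_curve(message: FeedMessage, symbol: str,
               strike_date: date) -> List[Tuple[float, float]]:
    """Return (strike, bid-ask midpoint) pairs for calls, sorted by strike.

    Using the midpoint as the option price is an assumption of this sketch.
    """
    points = [
        (r.strike_price, (r.option_bid + r.option_ask) / 2.0)
        for r in message.records
        if r.symbol == symbol and r.kind == "call"
        and r.strike_date == strike_date
    ]
    return sorted(points)
```

Grouping the calls for one asset and one date into a strike-ordered curve is the form of input that a difference-based estimate of the distribution would need.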

The financial database 120 stores the information received in the
information feed from the financial information providers and other
information, including, for example, interest rates and volatilities. The
financial database also stores the results generated by the analytical
software, including probability distribution functions with respect to the
underlying assets and assets that are not the subject of options.

The probability distribution information is generated continually (and
essentially in real time) from the incoming options data so that the
information provided and displayed to users is current. That is, the
information is not based on old historical data but rather on current
information about option prices.

In addition, other soft information can be accumulated, stored, and
provided to users, including fundamental characteristics of the underlying
assets, including prices, volatility values, beta, the identification of the
industry to which the asset belongs, the yield, the price to book ratio, and
the leverage. Other information could include calendars of earnings
forecast dates, earnings forecasts, corporate action items, news items that
relate to an industry, and the volume of institutional holdings.

The messages from the information provider 106 may be sent in response
to requests by the host server 102, the information may be sent to the host
server 102 automatically at a specified time interval, or the information
may be sent as received by the information provider from its sources. The
financial database 120 may be maintained on a separate server computer
(not shown) that is dedicated to the collection and organization of
financial data. The financial database is organized to provide logical
relationships among the stored data and to make retrieval of needed
information rapid and effective.

The user 108 may use, for example, a personal computer, a TV set top
box, a personal digital assistant (PDA), or a portable phone to
communicate with the network 104. Any of these devices may be running
an Internet browser to display the graphical user interface (GUI) generated
by the host server 102.

The host server 102 may provide probability distribution information on
the network 104 in the form of web pages and allow the individual user
108, the financial institution 112, the third party web server 114, and the
media operator 124 to view the information freely. The host company that
runs the host server 102 may generate revenue by, for example, selling
advertisement space on its web pages to an advertisement provider 110.
The host server 102 may also provide proprietary information and
enhanced services to individual users 108, financial institutions 112, third
party web servers 114, and media operators 124 for a subscription fee.

The host server 102 may have a direct link to the financial institutions 112
to provide tailored information in a format that can be readily incorporated
into the databases of the financial institutions 112. Financial institutions
112 may include, for example, investment banks, stock brokerage firms,
mutual fund providers, bank trust departments, investment advisers, and
venture capital investment firms. These institutions may incorporate the
probability distribution information generated by the analytical software
118 into the financial services that they provide to their own subscribers.
The probability distribution information provided by the host server 102
enables the stock brokerage firms to provide better advice to their
customers.

A third party web server 114 may incorporate probability distribution
information into its web site. The information may be delivered in the
form of an information feed to the third party host of web server 114 either
through the Internet or through a dedicated or dial-up connection.

Figure 10 shows an example of a data feed 182 sent from the host server
102 to the third party web server 114 through communication link 128.
Data feed 182 carries messages 184 that include header information 186,
identifying the sender and receiver, and records 188 that relate to specific
underlying assets.

Each record 188 includes an item 190 that identifies a future date, a
symbol 192 identifying the asset, risk-neutral probability density
information 193, and cumulative distribution information 194. The record
could also include a symbol identifying a second asset 195 with respect to
the identified future date, and so on. Other information could be provided
such as a risk premium value with respect to the risk-neutral values.

Examples of third party web servers 114 are the web servers of
E*TRADE, CBS MarketWatch, Fidelity Investments, and The Wall Street
Journal. The third party web server 114 specifies a list of assets for which
it needs probability distribution information. Host server 102 periodically
gathers information from financial information provider 106 and its own
financial database 120, generates the probability distribution information
for the specified list of assets, and transmits the information to the third
party web server 114 for incorporation into its web pages.

Examples of the media operator 124 are cable TV operators and
newspaper agencies that provide financial information. For example, a
cable TV channel that provides stock price quotes may also provide
probability distribution information generated by the host server 102. A
cable TV operator may have a database that stores the probability
distributions of all the stocks that are listed on the NYSE for a number of
months into the future. The host server 102 may periodically send updated
information to the database of the cable TV operator. When a subscriber
of the cable TV channel views the stock price quotes on a TV, the
subscriber may send commands to a server computer of the cable TV
operator via modem to specify a particular stock and a particular future
date. In response, the server computer of the cable TV operator retrieves
the probability distribution information from its database and sends the
information to the subscriber via the cable network, e.g., by encoding the
probability distribution information in the vertical blanking interval of the
TV signal.

Likewise, a newspaper agency that provides daily transaction price quotes
may also provide the probabilities of stock prices rising above certain
percentages of the current asset prices at a predetermined future date, e.g.,
6 months. A sample listing in a newspaper may be "AMD 83 88 85
A40%", meaning that the AMD stock has a lowest price of $83, highest
price of $88, a closing price of $85 that is higher than the previous closing
price, and a 40% probability of rising 10% in 6 months.

The analytical software 118 may be written in any computer language 
such as Java, C, C++, or FORTRAN. The software may include the 
following modules: (1) input module for preprocessing data received from 
the financial data sources; (2) computation module for performing the 
mathematical analyses; (3) user interface module for generating a 
graphical interface to receive inputs from the user and to display charts 
and graphs of the computation results; and (4) communications interface
module for handling the communications protocols required for accessing 
the networks. 

Web pages and user interfaces 

A variety of web pages and user interfaces can be used to convey the 
information generated by the techniques described above. 

For example, referring to figure 5, a GUI 700 enables a user 108 to obtain
a range of financial services provided by the host server 102. The user 108
may see the implied probabilities of future prices of marketable assets 706
having symbols 704 and current prices 708. The information displayed
could include the probabilities 714 (or 718) of the asset prices rising above
a certain specified percentage 712 (or falling below a certain specified
percentage 716) of the reference price 710 within a specified period of
time 720.

For the convenience of the user 108, GUI 700 includes links 730 to
institutions that facilitate trading of the assets. The host company that
runs the host server 102 sells advertising space 728 on the GUI 700 to
obtain revenue. The GUI 700 also has links 726 to other services provided
by the host server 102, including providing advice on lifetime financial
management, on-line courses on topics related to trading of marketable
assets, research on market conditions related to marketable assets, and
management of portfolios of assets.
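The probabilities 714 and 718 shown in GUI 700 can be read off a cumulative distribution once the percentage is converted to a price threshold. The sketch below assumes the cumulative distribution is available as a table of (price, F) values and interpolates linearly between table entries; the function names and the interpolation choice are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def cdf_at(prices: Sequence[float], cumulative: Sequence[float],
           x: float) -> float:
    """Linearly interpolate a tabulated F(x); clamp outside the table."""
    if x <= prices[0]:
        return cumulative[0]
    if x >= prices[-1]:
        return cumulative[-1]
    i = bisect_right(prices, x)          # prices[i-1] <= x < prices[i]
    x0, x1 = prices[i - 1], prices[i]
    f0, f1 = cumulative[i - 1], cumulative[i]
    return f0 + (f1 - f0) * (x - x0) / (x1 - x0)


def prob_rise_above(prices, cumulative, reference: float, pct: float) -> float:
    """Probability that the price ends above reference * (1 + pct/100)."""
    return 1.0 - cdf_at(prices, cumulative, reference * (1.0 + pct / 100.0))


def prob_fall_below(prices, cumulative, reference: float, pct: float) -> float:
    """Probability that the price ends below reference * (1 - pct/100)."""
    return cdf_at(prices, cumulative, reference * (1.0 - pct / 100.0))
```

With a hypothetical table F(60)=0, F(80)=0.2, F(100)=0.5, F(120)=0.8, F(140)=1 and a reference price of 100, both a 10% rise and a 10% fall come out at about 35%.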

Referring to figure 6, the GUI 700 also may display an interactive web
page to allow the user 108 to view the market's current prediction of future
values of portfolios of assets. The past market price 734 and current
market price 736 of the asset portfolios 732 are displayed. Also displayed
is the price difference 738. The GUI 700 displays the probability 744 (or
746) that the portfolio 732 will gain (or lose) a certain percentage 740
within a specified time period 742. Examples of portfolios include stock
portfolios, retirement 401(k) plans, and individual retirement accounts.
Links 748 are provided to allow the user 108 to view the market's current
forecast of future price trends of the individual assets within each
portfolio.

Referring to figure 7, in another user interface, the GUI 700 displays an
interactive web page that includes detailed analyses of past price history
and the market's current forecast of the probability distribution of the
future values of a marketable asset over a specified period of time. The
GUI 700 includes price-spread displays 750 representing the cumulative
distribution values of the predicted future prices of an asset over periods of
time. The price-spread display 750a shows the price distribution data that
was generated at a time three months earlier. A three-month history of the
actual asset prices is shown as a line graph for comparison to give the user
108 a measure of the merit of the price distribution information. The
price-spread display 750b represents the predicted cumulative distribution
values of the asset prices over a period of one month into the future. The
left edge of display 750b, of course, begins at the actual price of the asset
as of the end of the prior three-month period, e.g., the current DELL stock
price of $50. The probability distribution information implies, for
example, a 1% probability that the stock price will fall below $35, and a
99% probability that the stock price will fall below $80 in one month. GUI
700 includes table 752 that shows highlights of asset information and
graph 754 that shows sector risks of the asset. A box 755 permits a user to
enter a target price and table 757 presents the probability of that price at
four different future times, based on the calculated implied probability
distributions.
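The edges of a price-spread display such as 750b correspond to quantiles of the cumulative distribution. As a rough sketch, a quantile can be obtained by inverting a tabulated cumulative distribution with linear interpolation. The function name `quantile` and the three-point table used in the usage note are hypothetical; only the 1% and 99% levels at $35 and $80 come from the example in the text.

```python
from typing import Sequence


def quantile(prices: Sequence[float], cumulative: Sequence[float],
             p: float) -> float:
    """Return the interpolated price x at which F(x) = p.

    Assumes `cumulative` is nondecreasing and aligned with `prices`.
    Values of p outside the table are clamped to the end prices.
    """
    if p <= cumulative[0]:
        return prices[0]
    for i in range(1, len(prices)):
        if cumulative[i] >= p:
            # Here cumulative[i-1] < p <= cumulative[i], so f1 > f0.
            x0, x1 = prices[i - 1], prices[i]
            f0, f1 = cumulative[i - 1], cumulative[i]
            return x0 + (x1 - x0) * (p - f0) / (f1 - f0)
    return prices[-1]
```

For a table with F(35)=0.01, F(50)=0.50, and F(80)=0.99, the calls `quantile(prices, cumulative, 0.01)` and `quantile(prices, cumulative, 0.99)` give the lower and upper edges of a 98% band.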

Referring to figure 8, in another approach, a window 402 is displayed on a
user's screen showing financial information along with two other windows
408 and 410 showing probability distribution information. The individual
user 108 could have previously downloaded a client program from the
host server 102. When the user is viewing any document, e.g., any web
page (whether of the host server 102 or of another host's server), the user
may highlight a stock symbol 404 using a pointer 406 and type a
predetermined keystroke (e.g., "ALT-SHIFT-Q") to invoke the client
program. The client program then sends the stock symbol as highlighted
by the user to the host server 102. The host server 102 sends probability
distribution information back to the client program, which in turn displays
the information in separate windows 408 and 410.

When the client program is invoked, a window 422 may be displayed
showing the different types of price information that can be displayed. In
the example shown, the "Probability distribution curve" and "Upper/lower
estimate curves" are selected. Window 408 shows the price range of AMD
stock above and below a strike price of $140 from July to December, with
90% probability that the stock price will fall between the upper and lower
estimate curves. Window 410 shows the probability density curve f(x) for
AMD stock for a future date of 8/15/2000. The user may also specify a
default function curve, such that whenever an asset name is highlighted,
the default function curve is displayed without any further instruction from
the user.

Tabular data such as those shown in TABLE 1 may be generated by the
host server 102 and transmitted over the network 104 to devices that have
limited capability for displaying graphical data. As an example, the
individual user 108 may wish to access asset probability distribution
information using a portable phone. The user enters commands using the
phone keypad to specify a stock, a price, and a future date. In response, the
host server 102 returns the probability of the stock reaching the specified
price at the specified future date in tabular format suitable for display on
the portable phone screen.

Referring to figure 9, a portable phone 500 includes a display screen 502,
numeric keys 506, and scrolling keys 504. A user may enter commands
using the numeric keys 506. Price information received from the host
server 102 is displayed on the display screen 502. Tabular data typically
includes a long list of numbers, and the user may use the scrolling keys 504
to view different portions of the tabular data.

In the example shown in display screen 502, the AMD stock has a current
price of $82. The cumulative distribution values F(x) for various future
prices on 8/15/2000 are listed. The distribution indicates a 40% probability
that the stock price will be below $80, implying a 60% probability of the
stock price being above $80. Likewise, the distribution indicates an 80%
probability that the stock price will be below $90, implying a probability
of 20% of the stock price being above $90.
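The complement rule used in this example, P(price above x) = 1 - F(x), can be applied row by row when preparing text for a small screen. The following is a minimal sketch; the function `render_table` and its column layout are hypothetical and are not taken from figure 9.

```python
from typing import List, Sequence, Tuple


def render_table(symbol: str, current_price: float,
                 rows: Sequence[Tuple[float, float]]) -> List[str]:
    """Format (price, F(price)) rows as short text lines.

    Each row shows the probability of ending below the price, F(x),
    and the complementary probability of ending above it, 1 - F(x).
    """
    lines = [f"{symbol} now ${current_price:.0f}", "price  below  above"]
    for price, f in rows:
        lines.append(f"${price:<5.0f} {f:>4.0%}  {1.0 - f:>4.0%}")
    return lines


for line in render_table("AMD", 82, [(80, 0.40), (90, 0.80)]):
    print(line)
```

Short fixed-width lines of this kind can be scrolled one at a time on a device with limited display capability.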

Other embodiments are within the scope of the following claims.

1. A method comprising:
receiving data representing current prices of options on a given asset,
deriving from said data an estimate of a corresponding implied
probability distribution of the price of said asset at a future time, and
making information about said probability distribution available
within a time frame that is useful to investors.

2. The method of claim 1 in which the data represent a finite number
of prices of options at spaced-apart strike prices of the asset, and also
including
calculating a set of first differences of said finite number of prices
to form an estimate of the cumulative probability distribution of the price
of said asset at a future time.

3. The method of claim 2 also including
calculating a set of second differences of the finite number of
strike prices from the set of first differences to form an estimate of the
probability distribution function of the price of said asset at a future time.

4. A method comprising:
receiving data representing current prices of options on a given asset,
deriving from said data an estimate of a corresponding implied
probability distribution of the price of said asset at a future time, and
providing a real-time data feed containing information based on
said probability distribution.

5. A method comprising:
providing a graphical user interface for viewing pages containing
financial information related to an asset; and
when a user indicates an asset of interest, displaying probability
information related to the price of the asset at a future time.

6. A method comprising:
enabling a user to identify an asset of interest, the asset being one
for which data representing current prices of options on the asset are
available,
deriving from said data an estimate of a corresponding implied
probability distribution of the price of said asset at a future time, and
providing a display of a probability distribution of prices of the
asset at future times.

7. A method comprising:
enabling a user to indicate a future time and to identify an asset of
interest, the asset being one for which data representing current prices of
options on the asset are available, and
displaying to the user a distribution of the probability that the asset
will reach prices within a range of prices at the future time.

8. A method comprising:
receiving data representing current prices of options on a given
asset, the options being associated with spaced-apart strike prices of the
asset at a future time,
the data including shifted current prices of options resulting from a
shifted underlying price of the asset, the amount by which the asset price
has shifted being different from the amount by which the strike prices are
spaced apart, and
deriving from said data an estimate of a quantized implied
probability distribution of the price of said asset at a future time, the
elements of the quantized probability distribution being more finely
spaced than for a probability distribution derived without the shifted
current price data.

9. A method comprising:
receiving data representing current prices of options on a given
asset, the options being associated with spaced-apart strike prices of the
asset at a future time,
deriving from said data an estimate of an implied probability
distribution of the price of said asset at a future time, the mathematical
derivation including a smoothing operation, and
making information about said probability distribution available
within a time frame that is useful to investors.

10. The method of claim 9 in which the smoothing operation is
performed in a volatility domain.

11. The method of claim 9 in which the smoothing operation is
performed in the domain of the option prices or in the domain of the
probability distribution information.

12. A method comprising:
receiving data representing current prices of options on a given
asset, the options having strike prices at future dates,
deriving a volatility for each of the future dates in accordance with
a predetermined option pricing formula that links option prices with strike
prices of the asset;
generating a smoothed and extrapolated volatility function;
and using the volatility information to generate information within
a time-frame that is useful for investors.

13. The method of claim 12 in which the volatility function is
extrapolated to a wider range of dates than the future dates.

14. The method of claim 12 in which the volatility function is
extrapolated to strike prices other than the strike prices of the options.

15. The method of claim 9 also including
generating a smoothed volatility function using only data that are
reliable under a predetermined measure of reliability.

16. The method of claim 9, further comprising:
generating an implied volatility function formula having a
quadratic form with two variables representing a strike price and an
expiration date;
wherein coefficients of the implied volatility function formula are
determined by applying regression analysis to approximately fit the
implied volatility function formula to each of the implied volatilities.

17. A method comprising:
receiving data representing current prices of options on assets
belonging to a portfolio,
deriving from said data an estimate of an implied multivariate
distribution of the price of a quantity at a future time that depends on the
assets belonging to the portfolio, and
making information about said probability distribution available
within a time frame that is useful to investors.

18. A method comprising:
receiving data representing values of a set of factors that influence
a composite value,
deriving from said data an estimate of an implied multivariate
distribution of the price of a quantity at a future time that depends on
assets belonging to a portfolio, and
making information about said probability distribution available
within a time frame that is useful to investors.

19. The method of claim 18 in which the mathematical derivation
includes generating a multivariate probability distribution function based
on correlations among the factors.

20. A graphical user interface comprising:
a user interface element adapted to enable a user to indicate a
future time;
a user interface element adapted to show a current price of an
asset; and
a user interface element adapted to show the probability
distribution of the price of the asset at the future time.

Attorney Docket 11910-002001

21. A method comprising:
continually generating current data that contains probability
distributions of prices of assets at future times,
continually feeding the current data to a recipient electronically,
and
the recipient using the fed data for services provided to users.

22. A method comprising:
receiving data representing current prices of options on assets
belonging to a portfolio,
receiving data representing current prices of market transactions
associated with a second portfolio of assets, and
providing information electronically on the probability that the
second portfolio of assets will reach a first value given the condition that
the first portfolio of assets reaches a specified price at a future time.

23. A method comprising:
receiving data representative of actual market transactions
associated with a first portfolio of assets;
receiving data representative of actual market transactions
associated with a second portfolio of assets;
providing information on the expectation value of the price of first
portfolio of assets given the condition that the second portfolio of assets
reaches a first specified price at a specified future time through a network.

24. A method comprising:
evaluating an event defined by a first multivariate expression that
represents a combination of macroeconomic variables at a time T, and
estimating the probability that a second multivariate expression
that represents a combination of values of assets of a portfolio will have a
value greater than a constant B at time T if the value of the first
multivariate expression is greater than a constant A.

25. The method of claim 24 in which the probability is estimated using
Monte Carlo techniques.

26. A method comprising:
defining a regression expression that relates the value of one
variable representing a combination of macroeconomic variables at time T
to a second variable at time T that represents a combination of assets of a
portfolio, and
estimating the probability that the second variable will have a
value greater than a constant B at time T if the value of the first variable is
greater than a constant A at time T, based on the ratio of the probability of
X being greater than A under the regression expression and the probability
of X being greater than A.

27. A method comprising:
defining a current value of an option as a quadratic expression that
depends on the difference between the current price of the option and the
current price of the underlying security, and
using Monte Carlo techniques to estimate a probability distribution
of the value at a future time T of a portfolio that includes the option.
ABSTRACT

Data are received that represent current prices of options on a given asset.
An estimate is derived from the data of a corresponding implied
probability distribution of the price of the asset at a future time.
Information about the probability distribution is made available within a
time frame that is useful to investors, for example, promptly after the
current option price information becomes available.

COMBINED DECLARATION AND POWER OF ATTORNEY

As a below named inventor, I hereby declare that:

Appendix A



Three aspects of the invention are: 

1. The recognition of the desirability of displaying to a financial investment customer in real
time, for example on a World Wide Web site, the probability distribution governing the
price of a particular asset (e.g., a stock) at a selected future time.

2. The recognition that such probability distributions can be derived from option prices for 
that asset, or for related assets, which are readily available in real time. 

3. The recognition that probability distributions involving several asset prices simultaneously 
are useful to investment customers in several contexts, especially in exploring hypothetical 
scenarios, and that single asset distributions such as (but not restricted to) the above can 
be meaningfully incorporated into multivariate distributions, manageably determined. 

In this appendix we first describe a basic method for deriving probability distributions for 
single assets from option prices. We next describe improvements on this basic method to address 
various practical issues. Then we take up the multivariate case and show how to extend this 
kind of single asset price distribution, or any other, to the multivariate case. Finally, we consider 
a number of novel multivariate applications, with emphasis on scenario exploration. 

1 Basic method 

A call option is an option to buy an asset (e.g., a stock) at a certain price x (called the strike price)
on a given expiration date T days in the future. (An option exercisable only on the expiration
date is called a European-style option; for simplicity we will consider in this discussion only this
type of option.^1) Similarly, a put option is an option to sell an asset at a strike price x on a
given expiration date. (The "European-style" assumption of no possible early exercise is more
important here, but can also be ignored for puts that are not too deeply "in the money.")

Let $c(x)$ denote the price of a call option on an asset at strike price $x$ and $p(x)$ the price of a
put option. Such prices are established by options market-makers. We have realized that such
prices implicitly contain information about a "market view" of the probability distribution of
the price of that asset at the expiration date.

In a simple but precise form, this market view can be stated as follows. Suppose that we were
given the call price curve $c(x)$ or the put price curve $p(x)$ as a continuous function of the strike
price $x$ for all $x > 0$. Then, the second derivative of either the call or the put price curve is the
market view of the risk-neutral probability density function (pdf) $f(x)$ of the asset price at the
expiration date. In other words, $f(x) = c''(x) = p''(x)$.
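With only a finite set of strikes, the second derivative can be approximated by finite differences of the option prices. The sketch below is an illustration under stated assumptions, not the exact procedure of the invention: strikes are uniformly spaced, the interest rate is taken as zero (so no discount factor appears), and the function names `call_prices` and `implied_distribution` are hypothetical. First differences of the call prices approximate $c'(x) = F(x) - 1$, and second differences recover the probability mass near each interior strike.

```python
from typing import List, Sequence, Tuple


def call_prices(outcomes: Sequence[float], probs: Sequence[float],
                strikes: Sequence[float]) -> List[float]:
    """Undiscounted call price E[max(S - K, 0)] at each strike K."""
    return [sum(p * max(s - k, 0.0) for s, p in zip(outcomes, probs))
            for k in strikes]


def implied_distribution(strikes: Sequence[float], calls: Sequence[float]
                         ) -> Tuple[List[float], List[float]]:
    """Estimate cumulative values and probability masses by differencing.

    Returns (cdf, mass): cdf[i] approximates F on the interval starting at
    strikes[i], and mass[i] is the probability assigned to strikes[i + 1].
    Dividing mass by the strike spacing gives a density estimate.
    """
    h = strikes[1] - strikes[0]                      # uniform spacing assumed
    slopes = [(calls[i + 1] - calls[i]) / h          # first differences
              for i in range(len(calls) - 1)]
    cdf = [1.0 + s for s in slopes]                  # c'(x) = F(x) - 1
    mass = [slopes[i + 1] - slopes[i]                # second differences
            for i in range(len(slopes) - 1)]
    return cdf, mass
```

As a check, prices generated from a known discrete distribution on the strike grid are differenced back to exactly that distribution, for example `implied_distribution(strikes, call_prices(outcomes, probs, strikes))`.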

The idea that option prices determine some kind of implied probability distribution is fairly
well known in the financial literature. The idea that a pdf can be computed by taking the
second derivative of a continuous option price curve is known in the academic literature, but it
does not appear to be very well known. For example, the standard textbook "Options, Futures,
and Other Derivatives," by John C. Hull (Fourth Edition, 1999; Prentice-Hall) mentions implied
probabilities, but not the second-derivative method. The best reference that we have been able
to find is J. C. Jackwerth and M. Rubinstein, "Recovering probability distributions from option
prices," J. Finance, vol. 51, pp. 1611-1631 (1996), which has only six prior references.

^1 Even allowing for possible early exercise, most liquidly traded call options without large dividends can be
treated as if there were no possibility of such exercise, since sale of the option is usually a better alternative;
therefore, these call options behave similarly to European-style options.

The risk-neutral distribution (at a fixed future time $T$, for a fixed asset) is defined as the price
distribution that would hold if market participants were neutral to risk, which they generally
are not. However, many asset pricing theories, such as those underlying Black-Scholes option
theory and most of the variations found in the Hull book above, allow for the true risk-averse
asset price distribution to be obtained from the risk-neutral distribution $f(x)$ just by adjusting
the latter by an appropriate risk premium: if there are no dividends, the true distribution is just
$f(x e^{-(\mu - r)T})$, where $\mu - r$ is the expected annual return rate for the stock in excess of the
risk-free rate $r$. We use a variation on this simple format, slightly modified to allow for dividends
(see below), though our invention could also work well with a more complicated adjustment. In this
format, a value for $\mu - r$ must still be supplied. We use as a default the "consensus estimate"
taken from the textbook "Active Portfolio Management" (1995) by Grinold and Kahn. These
authors note a long-term average value of the risk premium to be 6% per year, and suggest
multiplying this number by the stock's beta to get $\mu - r$. The parameter beta is the slope of the
line giving a regression of the stock in question against a market portfolio, often taken as the
S&P 500. This is the well-known CAPM estimate for the expected excess return. Whether good
or bad, its stature as a consensus estimate makes it suited to our aim of providing a market
view, though it is only a default. Our invention, which provides the risk-neutral component
of the probabilities, could work with other estimates for the risk-averse adjustment parameter
$\mu - r$ and with any explicit scheme for adjusting the risk-neutral probability density to the
risk-averse probability density. It is worth pointing out that, for shorter time periods (even a
month or two), the risk adjustment required is small and generally overwhelmed by fluctuations
in the risk-neutral distribution itself.
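As a rough illustration of the default adjustment just described, the following Python sketch computes $\mu - r$ as 6% per year times beta and the point $x e^{-(\mu - r)T}$ at which the risk-neutral density would be evaluated in the no-dividend case. The function names and the sample beta of 1.2 are assumptions made for the example, not part of the method as stated.

```python
import math

def default_excess_return(beta, market_premium=0.06):
    """Default for mu - r: long-term market risk premium times the stock's beta."""
    return market_premium * beta

def risk_adjusted_argument(x, excess_return, T):
    """Point x * exp(-(mu - r) * T) at which the risk-neutral pdf f is evaluated
    to read off the risk-averse distribution (no-dividend case only)."""
    return x * math.exp(-excess_return * T)

# Hypothetical stock with beta 1.2, horizon of one month (T in years).
excess = default_excess_return(beta=1.2)
shifted = risk_adjusted_argument(100.0, excess, T=1.0 / 12.0)
```

For a one-month horizon the shift is well under one percent of the price, which is consistent with the remark that the adjustment is small over short periods.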

We give a brief proof that the second derivative procedure gives the correct risk-neutral
probability distribution. As in Hull, we may calculate the European call or put price as an expected
value in the risk-neutral distribution.

If the actual value of the asset on the expiration date is $v$, then the value of a call option at
strike price $x$ is $\max\{v - x, 0\}$, and the value of a put option is $\max\{x - v, 0\}$. If the actual
value is a random variable with pdf $f(v)$, then the expected value of a call option at $x$ at the
expiration date is

$$c_T(x) = E_v[\max\{v - x, 0\}] = \int_x^\infty (v - x) f(v)\, dv,$$

and the expected value of a put option at $x$ at the expiration date is

$$p_T(x) = E_v[\max\{x - v, 0\}] = \int_0^x (x - v) f(v)\, dv.$$

The current values $c(x)$ and $p(x)$ may be obtained by discounting $c_T(x)$ and $p_T(x)$ by $e^{-rT}$,
where $r$ is the risk-free interest rate, but for our purposes, forecasting probability distributions
at time $T$, we do no discounting, and henceforth just write $c(x) = c_T(x)$, $p(x) = p_T(x)$.

Parenthetically, from these expressions we observe that

$$p(x) - c(x) = E_v[x - v] = x - s^*,$$

where $s^* = E_v[v]$ is the expected value of the asset at the expiration date under the risk-neutral
distribution. (If there are no dividends, then $s^* = s e^{rT}$; if there are dividends, then in general
it is necessary to subtract from $s e^{rT}$ the value at time $T$ of the dividends.) This well-known
relation is called put-call parity; it shows why either price curve carries the same information.

From the above expression for $c(x)$, it follows that its first derivative is

$$c'(x) = -\int_x^\infty f(v)\, dv = F(x) - 1,$$

where $F(x) = \int_0^x f(v)\, dv$ is the cumulative distribution function (cdf) of the random variable
$v$. To prove this, note that $v - x = \int_x^v dz$. Therefore

$$c(x) = \int_x^\infty (v - x) f(v)\, dv = \int_x^\infty dv \int_x^v dz\, f(v) = \int_x^\infty dz \int_z^\infty dv\, f(v) = \int_x^\infty dz\, (1 - F(z)),$$

where we interchange the order of integration over $v$ and $z$ on the two-dimensional region
$\mathcal{R} = \{(v, z) : x \le z \le v\}$. The last expression implies that $c'(x) = -(1 - F(x))$.

From put-call parity, it follows similarly that

$$p'(x) = 1 + c'(x) = F(x).$$

Since the cdf and pdf are related by $F'(x) = f(x)$, these expressions in turn imply that the
second derivative of either $c(x)$ or $p(x)$ is the pdf $f(x)$:

$$c''(x) = p''(x) = F'(x) = f(x).$$
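The result can be checked numerically. The sketch below is only an illustration under assumed inputs: it takes a lognormal density as a stand-in for $f$, computes the undiscounted call value $c(x)$ by a trapezoid-rule integral, and compares the second difference of $c$ with $f(x)$. The density parameters, the integration cutoff and the step sizes are all arbitrary choices made for the example.

```python
import math

def lognormal_pdf(v, mean_log=math.log(100.0), sd_log=0.2):
    """Example risk-neutral pdf f(v); any density on v > 0 could be used instead."""
    z = (math.log(v) - mean_log) / sd_log
    return math.exp(-0.5 * z * z) / (v * sd_log * math.sqrt(2.0 * math.pi))

def call_value(x, pdf, upper=400.0, steps=40000):
    """Undiscounted c(x) = integral from x to `upper` of (v - x) f(v) dv (trapezoid rule)."""
    h = (upper - x) / steps
    total = 0.0
    for i in range(steps + 1):
        v = x + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (v - x) * pdf(v)
    return total * h

def second_difference(x, pdf, dx=1.0):
    """Central second difference of c at x, an approximation to c''(x)."""
    return (call_value(x + dx, pdf) - 2.0 * call_value(x, pdf)
            + call_value(x - dx, pdf)) / dx ** 2

estimate = second_difference(100.0, lognormal_pdf)  # approximates c''(100)
exact = lognormal_pdf(100.0)                        # f(100)
```

The two numbers agree up to the discretization error of the second difference, which shrinks as `dx` is reduced.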

The general character of the option price curves $c(x)$ and $p(x)$ is therefore as follows:

• For all $x$ less than the minimum possible value of $v$ (i.e., such that $F(x) = 0$), $c(x) =
E_v[v] - x = s^* - x$ and $p(x) = 0$. In other words, $c(x)$ is a straight line of slope $-1$ starting
at $c(0) = E_v[v] = s^*$, while $p(x) = 0$.

• For all $x$ greater than the maximum possible value of $v$ (i.e., such that $F(x) = 1$), $c(x) = 0$
and $p(x) = x - s^*$. In other words, $p(x)$ is a straight line of slope $+1$ and x-intercept $s^*$,
while $c(x) = 0$.

• These two line segments are joined by a continuous convex (∪-shaped) curve whose slope increases
from $-1$ to 0 for $c(x)$, and from 0 to $+1$ for $p(x)$.

We note that the fact that the mean $E_v[v]$ of the pdf $f(x)$ is $s^*$, the value in future dollars at
time $T$ of the underlying price $s$ (less the value of any dividends), implies that option prices must
be constantly adjusted to reflect changes in the underlying price $s$, even if there is no market
activity in the options.

The fact that $s^* = E_v[v]$ also implies that an option price curve can make no prediction about
the general direction of the underlying price $s$. However, the option price curve does predict the
shape of the pdf $f(x)$, and in particular its volatility.
1.1 Approximations based on a finite subset of bid-asked option prices

In practice, option prices $c(x)$ and $p(x)$ are quoted only for a finite subset of equally-spaced
strike prices $x$, namely $x = n\Delta$ for integer $n$ and spacing $\Delta$. We denote $c(n\Delta)$ and $p(n\Delta)$ by $c_n$
and $p_n$, respectively. Moreover, quotes specify only a bid-asked spread, not exact prices. In this
subsection we give methods for dealing with these problems. (Most of the Jackwerth-Rubinstein
paper (op. cit.) is concerned with these kinds of curve-fitting problems.)

The first derivatives $c'(x)$ and $p'(x)$ at $x = (n + \tfrac{1}{2})\Delta$ may be estimated by the first differences

$$c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta}, \qquad p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}.$$

The corresponding estimates of the cdf $F_{n+1/2} = F((n + \tfrac{1}{2})\Delta)$ are

$$F_{n+1/2} = 1 + c'_{n+1/2}, \qquad F_{n+1/2} = p'_{n+1/2}.$$

Thus, using both bid and ask prices for both European-style puts and calls, one can compute
four different estimates for the cdf $F_{n+1/2}$, which can then be combined into a single estimate.

This combination will preferably take into account whether $x = (n + \tfrac{1}{2})\Delta$ is much less than
the underlying price $s$ ("deep out-of-the-money"), near $s$ ("near the money"), or much greater
than $s$ ("deep in-the-money"), according to the different patterns of setting bid-asked spreads in
these different ranges. Another consideration is avoiding quotes near prices where early exercise
is likely, such as deep in-the-money puts.

Similarly, the second derivatives $c''(x)$ and $p''(x)$ at $x = n\Delta$ may be estimated by the first
differences of the estimates of the first derivatives; e.g.,

$$c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2}.$$

We may take $c''_n$ or $p''_n$ or some combination as above as our estimate $f_n$ of the pdf $f(n\Delta)$.

Note that since $f(x) \ge 0$, option prices should satisfy a convexity condition, e.g., $c_{n+1} - 2c_n +
c_{n-1} \ge 0$ for call option prices. Indeed, violation of this condition would allow making money
via a risk-free "butterfly straddle" involving buying one call option at $(n + 1)\Delta$ and another at
$(n - 1)\Delta$, and selling two call options at $n\Delta$. A similar result holds for put options.
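The finite-difference estimates and the convexity check above can be sketched in a few lines of Python. This is a minimal illustration that uses a single price per strike (for instance a bid-asked midpoint, which is an assumption of the example), with made-up undiscounted quotes at strikes 90 to 110 and a forward price of 100; it does not attempt the weighting of the four bid/ask estimates discussed above.

```python
def cdf_from_calls(calls, delta):
    """F at half-integer strikes: F_{n+1/2} = 1 + (c_{n+1} - c_n) / delta."""
    return [1.0 + (calls[i + 1] - calls[i]) / delta for i in range(len(calls) - 1)]

def cdf_from_puts(puts, delta):
    """F at half-integer strikes: F_{n+1/2} = (p_{n+1} - p_n) / delta."""
    return [(puts[i + 1] - puts[i]) / delta for i in range(len(puts) - 1)]

def pdf_from_prices(prices, delta):
    """f_n = (c_{n+1} - 2 c_n + c_{n-1}) / delta^2 at the interior strikes."""
    return [(prices[i + 1] - 2.0 * prices[i] + prices[i - 1]) / delta ** 2
            for i in range(1, len(prices) - 1)]

def convexity_violations(prices):
    """Indices n where c_{n+1} - 2 c_n + c_{n-1} < 0 (a butterfly arbitrage)."""
    return [i for i in range(1, len(prices) - 1)
            if prices[i + 1] - 2.0 * prices[i] + prices[i - 1] < 0.0]

# Hypothetical undiscounted quotes at strikes 90, 95, 100, 105, 110 (delta = 5).
calls = [10.6, 6.6, 3.5, 1.5, 0.5]
puts = [0.6, 1.6, 3.5, 6.5, 10.5]   # consistent with put-call parity for s* = 100
cdf_c = cdf_from_calls(calls, 5.0)
cdf_p = cdf_from_puts(puts, 5.0)
pdf_c = pdf_from_prices(calls, 5.0)
```

Because the sample puts were built from the calls by put-call parity, the two cdf estimates coincide here; with real bid and ask quotes they would generally differ and would need to be combined.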

1.2 Dynamic estimates

The methods considered in the previous subsection allow estimation of the cdf and pdf at a
subset of $\Delta$-spaced values of $x$ based on a static set of option quotes at a particular time.

As previously noted, however, option prices must change continually in response to changes
in the underlying price $s$. Let $s^*$ denote the corresponding forward price at expiration (the
price $s$ evaluated with interest). Suppose this price (measured in dollars at expiration) moves
up (or down) by a small amount, an increment $\epsilon$ in its logarithm, say, with little or no change
in volatility. Here $\epsilon$ may be viewed as, approximately, the percentage move $\delta/s^*$ caused by a
move of $\delta$ in the (forward) stock price. We expect in this situation that the (forward) probability
distribution for the stock price will just be shifted by $\epsilon$ in the log domain. That is, the distribution
will appear to be identical there, except with a mean shifted by $\epsilon$. Thus, the value of the new cdf
at $x = e^{\ln x}$ is $F(e^{\ln x - \epsilon}) = F(x/a)$, where $F$ denotes the original cdf with distribution mean
$s^*$, and $a = e^{\epsilon}$. A reasonable call price functional equation that gives the same effect, upon
differentiation, is

$$a\, c(s^*, x/a) = c(a s^*, x),$$

where $c(s^*, x)$ denotes the price, in dollars at expiration, for a call option at strike $x$ when the
underlying is at (forward) price $s^*$. Note in this equation that all other variables, such as volatility, are
assumed to be the same, which will only be approximately true, even for very small values of $\epsilon$.

But, assuming this approximation, we can think of an option price at strike $x$, measured when
the (forward) price has moved to $a s^*$, for $a$ near 1, as giving instead $a$ times the price of an
option at strike $x/a$, but corresponding to the current underlying price $s^*$. Considering all
the strikes at which options are frequently quoted, and thinking additively, we can effectively
observe $c(x)$ (and $p(x)$) for a different subset of approximately equally-spaced strike prices,
roughly $x = n\Delta - \delta$ for various values of $\delta = \epsilon s^*$. Some care must, of course, be taken to ensure
simultaneity of prices, of option and underlying. For this reason, we may prefer to consider the
values of $n\Delta$ (corresponding to the various standard strike values) separately, and synchronize
observed time of sales for an option at a given strike with the underlying security. Implied
volatilities (discussed below) could be monitored, to ensure their changes relative to $\epsilon$ were
small.

Using a similar technique to that described in the above paragraphs, meaningful average option
prices for a given strike can also be computed, using thin strike intervals and using either short
time intervals or time series methods (time averages weighting the present more than the past).
Note that, without the framework described in this subsection, the computation of "average"
option prices at a given strike is problematic when the stock price varies in the period over
which the average is taken.

To summarize: Given enough movements of the underlying price, we can effectively observe
prices and compute estimates as above for a much more finely quantized subset of strike prices
$x$, and provide a framework for improving accuracy through averaging methods.
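As an illustration of the rescaling idea, the sketch below maps a call quote observed when the forward price has moved to $a s^*$ back onto a reference forward price $s^*$, using the functional equation $a\,c(s^*, x/a) = c(a s^*, x)$. The function name, argument names and sample numbers are assumptions made for the example, and the mapping inherits the approximation that volatility is unchanged by the move.

```python
def rescale_quote(strike, price, forward_now, forward_ref):
    """Re-express a call quote (strike x, price c) seen at forward a * s_ref
    as an equivalent quote at the reference forward s_ref.

    From a * c(s_ref, x / a) = c(a * s_ref, x): the equivalent quote has
    strike x / a and price c / a, where a = forward_now / forward_ref.
    """
    a = forward_now / forward_ref
    return strike / a, price / a

# Hypothetical quote: strike 100 call priced at 4.00 when the forward is 101,
# re-expressed for a reference forward of 100.
eq_strike, eq_price = rescale_quote(100.0, 4.0, forward_now=101.0, forward_ref=100.0)
```

Repeating this for quotes taken at different levels of the underlying yields effective observations at strikes lying between the standard $\Delta$-spaced values, which is the refinement described above.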

2 Methods for extrapolation and smoothing 

There are two major limitations to the basic methods of the previous section. One is that option 
quotes are available only for certain expiration dates. Another, not so obvious, is that option 
quotes are reliable primarily for options in which there is substantial market activity. These 
would typically be nearer-term options at strike prices near the money (the underlying price). 

To extend our prediction methods to times other than expiration dates and over wider ranges 
of strike prices (and also to help reduce "noise" in our displays), we use extrapolation and 
smoothing techniques. We have found that it is advantageous to do extrapolation and smoothing 
in the volatility domain. 

There are many reasons for this advantage. For example, option practitioners are well aware
of the kinds of shapes that the volatility curves (sometimes called "volatility smiles") have had
historically, in various markets, and how these curves vary with time; this can be a guide to
imposing structure on the smoothing curves to prevent overfitting of possible artifacts. Many
records have been kept of the volatilities implied by option prices, and it is easy to examine how
in the past they have changed with respect to price behavior. For example, the Chicago Board
Options Exchange makes public its average near-the-money volatility index (now called VIX) for
S&P 100 options back to 1986. Finally, it is easier to work visually with volatility curves, which
would theoretically be flat if $f(x)$ were lognormal, than with visual differences in near-lognormal
pdfs, which can all look very much alike. Mathematically, model improvements can be made
in the volatility domain just by changing coefficients of low-degree polynomial approximations,
even though these affect higher-order terms in power series for the corresponding cdfs or pdfs.

The following subsections explain more precisely how to work in the volatility domain. 
2.1 Lognormal pdfs

The standard Black-Scholes theory of option pricing (see Hull, op. cit.) yields a lognormal pdf
$f(v)$ whose expected value is $E_v[v] = s^*$, such that $\ln v$ is a Gaussian (normal) random variable
with variance $\sigma^2 T$, where the parameter $\sigma$ is called the volatility rate of the asset, and $T$ is
the time to expiration. By a standard property of lognormal distributions, this implies that the
mean of $\ln v$ is $E_v[\ln v] = \ln s^* - \sigma^2 T/2$.

From this pdf follows the famous Black-Scholes call option pricing formula [Hull, Appx. 11A]:

$$c(x) = E_v[\max\{v - x, 0\}] = s^* N(d_1(x)) - x N(d_2(x)),$$

where $N(d_1(x))$ and $N(d_2(x))$ are values of the cumulative distribution function of a Gaussian
random variable of mean zero and variance 1 at the points

$$d_1(x) = \frac{\ln(s^*/x) + \sigma^2 T/2}{\sigma\sqrt{T}} = d_2(x) + \sigma\sqrt{T};$$

$$d_2(x) = \frac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}} = \frac{E_v[\ln v] - \ln x}{\sigma\sqrt{T}}.$$

(Recall that our version of the call price is not discounted, and is given in dollars at time $T$, and
that $s^*$ is today's stock price, valued in dollars at time $T$, less the value of any dividends.) Note
that $\sigma\sqrt{T}$ is the standard deviation of $\ln v$; therefore $-d_2(x)$ is just $\ln x$, measured in standard
deviations from the mean $E_v[\ln v]$.

Similarly, by put-call parity, we have the Black-Scholes put option pricing formula

$$p(x) = c(x) + x - s^* = s^*(N(d_1(x)) - 1) - x(N(d_2(x)) - 1) = x N(-d_2(x)) - s^* N(-d_1(x)).$$

Taking the derivative with respect to $x$, and using $s^* N'(d_1(x)) = x N'(d_2(x))$ and $d_1'(x) = d_2'(x)$
(the latter equation holding under the assumption of constant volatility, which we will later
drop), we obtain

$$F(x) = c'(x) + 1 = -N(d_2(x)) + 1 = N(-d_2(x)).$$

Now $F(x)$ is the probability that $v \le x$, which is equal to the probability that $\ln v \le \ln x$, which
since $\ln v$ is Gaussian with mean $E_v[\ln v]$ and standard deviation $\sigma\sqrt{T}$ is given by

$$F(x) = \Pr\{v \le x\} = \Pr\{\ln v \le \ln x\} = N\!\left(\frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}\right) = N(-d_2(x)).$$

Thus we have verified that the Black-Scholes pricing formulas give the correct cdf $F(x)$. The
derivative of $F(x)$ will thus yield the correct lognormal distribution $f(x) = F'(x)$.
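The undiscounted formulas of this subsection translate directly into code. The following Python sketch is a minimal illustration (the function and argument names are choices made for the example) that evaluates $c(x)$, $p(x)$ and $F(x) = N(-d_2(x))$ for a given forward price $s^*$, volatility $\sigma$ and time $T$, using the standard library's normal distribution.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cdf

def d2(x, forward, sigma, T):
    """d2(x) = (ln(s*/x) - sigma^2 T / 2) / (sigma sqrt(T))."""
    return (math.log(forward / x) - 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))

def d1(x, forward, sigma, T):
    """d1(x) = d2(x) + sigma sqrt(T)."""
    return d2(x, forward, sigma, T) + sigma * math.sqrt(T)

def call_value(x, forward, sigma, T):
    """Undiscounted call value c(x) = s* N(d1) - x N(d2), in dollars at time T."""
    return forward * N(d1(x, forward, sigma, T)) - x * N(d2(x, forward, sigma, T))

def put_value(x, forward, sigma, T):
    """Undiscounted put value from put-call parity: p(x) = c(x) + x - s*."""
    return call_value(x, forward, sigma, T) + x - forward

def lognormal_cdf(x, forward, sigma, T):
    """F(x) = N(-d2(x)), the cdf of the asset value at time T."""
    return N(-d2(x, forward, sigma, T))
```

A quick consistency check is that `1 + (call_value(x + h, ...) - call_value(x - h, ...)) / (2 * h)` approaches `lognormal_cdf(x, ...)` for small `h`, which is the relation $F(x) = c'(x) + 1$ derived above.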
2.2 Characterization of general cdfs



Now let $F(x)$ be an arbitrary cdf on $\mathbb{R}_+$; i.e., a function that monotonically increases from 0 to
1 as $x$ goes from 0 to infinity. For simplicity we will assume that $F(x)$ is strictly monotonically
increasing; i.e., $f(x) = F'(x) > 0$ everywhere. Then there exists a continuous one-to-one
"warping function" $y : \mathbb{R}_+ \to \mathbb{R}$ such that $F(x) = N(y(x))$ everywhere; i.e., such that the
probability that a random variable $v$ with cdf $F(x)$ will satisfy $v \le x$ is equal to the probability
that a standard Gaussian random variable $n$ with mean zero and variance 1 will satisfy $n \le y(x)$.
Similarly, there is an inverse warping function $x(y)$ such that $F(x(y)) = N(y)$.

Given the warping function $y(x)$, the cdf $F(x)$ may be retrieved from the relation $F(x) =
N(y(x))$. Therefore the cdf $F(x)$ completely specifies the warping function $y(x)$, and vice versa;
i.e., both curves carry the same information.

If $F(x)$ is the cdf of a lognormal variable $v$ such that $\ln v$ has mean $E_v[\ln v] = \ln s^* - \sigma^2 T/2$
and variance $\sigma^2 T$, as in the previous subsection, then the warping function is given by

$$y(x) = -d_2(x) = \frac{\ln x - (\ln s^* - \sigma^2 T/2)}{\sigma\sqrt{T}} = \frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}.$$

For this reason we may sometimes write $y(x)$ as $-d_2(x)$, even when the cdf is not lognormal so
that the right-hand equation above for $-d_2(x)$ does not hold.



2.3 Implied volatilities

If $f(x)$ is not lognormal, then the Black-Scholes pricing formulas do not hold. Nonetheless, given
an option price $c(x)$ or $p(x)$, it is common practice to define the implied volatility $\sigma(x)$ as the
value of $\sigma$ such that the Black-Scholes pricing formula holds, for a given $x$, $s$ and $T$.

The implied volatility curve $\sigma(x)$ so defined is a function of the strike price $x$, which is constant
if and only if the pdf $f(x)$ is actually lognormal. In practice, it is typically a convex (∪-shaped) curve,
called a "volatility smile." See, e.g., Hull, Chapter 17.
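The traditional implied volatility can be found numerically, because the Black-Scholes call value increases with $\sigma$. The sketch below uses plain bisection on the undiscounted call formula; the search bracket, tolerance and names are assumptions of the example rather than anything prescribed by the text.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cdf

def call_value(x, forward, sigma, T):
    """Undiscounted Black-Scholes call value s* N(d1) - x N(d2)."""
    d2 = (math.log(forward / x) - 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d1 = d2 + sigma * math.sqrt(T)
    return forward * N(d1) - x * N(d2)

def implied_volatility(price, x, forward, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Value of sigma in [lo, hi] for which call_value(...) matches `price`."""
    if not call_value(x, forward, lo, T) <= price <= call_value(x, forward, hi, T):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if call_value(x, forward, mid, T) < price:
            lo = mid   # call value too low: volatility must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `implied_volatility` strike by strike to quoted prices gives the pointwise curve $\sigma(x)$; for prices generated from a single lognormal distribution the curve comes out flat.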

From Subsection 2.1, we can see that there is a second method of calculating implied volatilities,
as follows. Suppose that we have an estimate of the cdf $F(x)$. Define the cdf-implied volatility
$\sigma_1(x)$ as the value of $\sigma$ such that the Black-Scholes cdf formula $F(x) = N(-d_2(x, \sigma, T))$ holds,
for a given $x$, $s$ and $T$.

The first method has the advantages of being defined directly from raw price data, and of being
well understood in the financial community. However, the second method has the following
advantages:

1. It is easier to calculate, at least from estimates of $F(x)$;

2. It gives a simpler and arguably more intuitive relationship between volatility and the cdf
$F(x)$. If we use the traditional implied volatility $\sigma(x)$, then the relationship is instead

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\, \sigma'(x);$$

3. It fits better with the multivariate theory to be developed below.

We have observed that the two curves $\sigma(x)$ and $\sigma_1(x)$ seem to be fairly similar, at least as to
the direction of their slope, and are generally not too far apart in value "near the money". Also
$\sigma_1(x) = \sigma(x)$ whenever $\sigma(x)$ has zero slope, though $\sigma_1(x)$ is a little smaller than $\sigma(x)$ when the
slope $\sigma'(x)$ is negative (which often occurs for stocks). See the above equation. Finally, one
function is as ad hoc as the other. Therefore, because of the above reasons, we generally prefer
to use the cdf-implied volatility curve $\sigma_1(x)$.

In any case, it is clear that either $\sigma(x)$ or $\sigma_1(x)$ contains the same information as any of the
curves $c(x)$, $p(x)$, $F(x)$ or $f(x)$. From $\sigma(x)$ or $\sigma_1(x)$ we can recover $c(x)$ or $F(x)$ using the
Black-Scholes call option pricing or cdf formula, and from this we can obtain all other curves.



2.4 Extrapolation and smoothing in the volatility domain

The volatility curve $\sigma(x)$ or $\sigma_1(x)$ may be calculated pointwise from the corresponding curve
$c(x)$ or $p(x)$ to give a set of values at a finite subset of strike prices $x$. Each of these values may
be deemed to have a certain degree of reliability.

It is then a standard problem to fit a smoothed and extrapolated curve $\sigma(x)$ or $\sigma_1(x)$ to these
points, taking into account their relative reliabilities. Any standard smoothing and extrapolation
method may be used. In general, the usual problems of avoiding overfitting or oversmoothing
must be addressed.

It is well-known that implied volatilities also vary with time. We generally wish to estimate
curves $\sigma(x, T)$ or $\sigma_1(x, T)$ as replacements for the constant volatility $\sigma$ in the Black-Scholes
formulas, e.g., $c(x) = c(x, \sigma, T)$ or $F(x) = N(-d_2(x, \sigma, T))$.

In an especially meaningful example, we have experimented with a class of smoothing
algorithms used in "Implied volatility functions: Empirical tests," by B. Dumas, J. Fleming and
R. E. Whaley, J. Finance, vol. 53, pp. 2059-2106, Dec. 1998. These authors fit an implied
volatility curve $\sigma(x)$, for the purpose of setting up a "strawman" option price model for testing
(and defeating) a theory regarding the role of volatility in option pricing. Their "strawman"
option pricing model $c(x)$ was obtained by putting the resulting smoothed curve back into the
Black-Scholes call formula. It is a "strawman" ad hoc model, because no intuitive notion of
stock volatility could possibly vary with strike price, which the stock never "sees." Nevertheless,
their model performed admirably, surpassing in predictive power the highly regarded "implied
tree" method. One possible explanation offered was that their model mimicked in a smooth way
interpolation methods actually employed by practitioners in the options markets. (See the
discussion of "Volatility matrices" in Hull, cited above.) Such an approach to option pricing seems
ideal to us, because of its accuracy and because its underlying rationale represents a market
view. Thus, we use the Dumas-Fleming-Whaley model for our own entirely different purpose,
that of forecasting probability distributions. All that is necessary is to differentiate their call
price model, which, conveniently for us, is a smooth function of strike price and other standard
variables such as time, current stock price, and the risk-free rate of interest. The formula for
the cdf $F(x)$ is, as before, this derivative with 1 added, or

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\, \sigma'(x).$$

We can make this very explicit. We have

$$\frac{\partial c}{\partial \sigma} = x\sqrt{T}\, N'(-d_2(x)),$$

where $N'(z)$ denotes the standard normal density, while $\sigma'(x)$ may be computed by differentiating
the Dumas-Fleming-Whaley fitted volatility curve. The latter has the form

$$\sigma(x, T) = a_0 + a_1 x + a_2 x^2 + a_3 T + a_4 T^2 + a_5 x T.$$

The coefficients $\{a_i\}$ are determined by regression. This kind of quadratic curve-fitting is easily
implemented. Dumas-Fleming-Whaley impose a constraint to prevent their volatilities from
going below 0 (or even below 0.01), and we have imposed further constraints on extrapolations
(which we often carry out beyond the range of their tests), to ensure that the final cdf does not go
below zero or above one. We have experimented with other variations on their basic approach, for
example, using linear interpolation in the time domain, where we do not need to take derivatives.
Our methods would, of course, work with any approach, possibly quite different, to volatility
curve-fitting, though the general Dumas-Fleming-Whaley approach has many things going for
it: accuracy, conformity to marketplace use of Black-Scholes, smoothness (differentiability, in
particular), conformity to historical experience regarding the smile structure of volatility curves
(especially important for extrapolation), and simplicity (which, beyond ease of implementation,
helps avoid overfitting). These advantages are achieved in a probability context that was not
considered in the paper where these volatility curves were introduced.

3 The multivariate case 

The methods in the previous sections are capable of generating a display of raw or smoothed 
and extrapolated probability distributions for any optionable asset. Option prices are quoted 
on a large number of securities, as well as on certain indices, such as the S&P 500. 

However, an investor would also like to know future probability distributions for: 

• His or her entire portfolio; 

• Mutual funds; 

• A security without a quoted option; 

• A security in a hypothetical scenario. 

All of these questions involve considerations of several securities at once, and the probabilities 
of their simultaneous configuration of prices. This is clearly a consideration in the first two 
items above, but also enters in the third, where we would want to extract as much information 
as possible about the security without a quoted option price from those correlated with it that 
do have quoted options. Finally, in scenario analysis there are many questions that involve 
considering the probabilities of several security prices occurring at once, including changes in 
factors influencing the market that might be modeled by changes in a portfolio of those securities 
most affected. We will take all of these issues up in the remainder of this document, but for now 
we just try to give a basic introduction. 

For a portfolio of securities, or a mutual fund, we are interested in a composite asset of the
form

$$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n,$$

where the $x_i$ are all assets for which we individually know the cdf $F(x_i)$ or the pdf $f(x_i)$. To give
our method the most flexibility, we do not require that this knowledge come from any particular
procedure, though we favor the approach of the preceding two sections. However, even for some
securities or indices with a quoted option, we might not feel there was sufficient option activity
to justify a full fit of a volatility curve, and might take a cruder substitute, even a flat straight
line based on an average of available implied volatilities. In addition, it is convenient to allow
the possibility that a few assets we are monitoring might not have any quoted option at all; this
is easily accommodated by, say, using a flat volatility curve with a historical value for volatility.
For testing purposes and comparisons we might even want to consider a list of assets with all
volatility curves given this way. In any case our methodology here is very general, and we only
require that we know warping functions $y_i(x_i)$ such that $F(x_i) = N(y_i(x_i))$ for all $i$. If the
asset has an active options market, then the warping functions may be determined by either
first estimating $F(x_i)$ directly from (finite differences of) options price data, as in subsection
2.2, or by using the approach discussed later in Section 2 of extrapolating and smoothing in the
volatility domain. In the latter case we have an explicit form of the warping function $y_i(x_i)$ in
terms of a fitted volatility curve $\sigma_1(x_i, T)$ as $y_i(x_i) = -d_2(x_i, \sigma_1(x_i, T), T)$, and this equation
can also be used with any volatility curve with the assets above that might have fewer or no
traded options. In a later section we will discuss portfolios in the logarithm domain, possibly
containing long and short positions. One can think of warping them to standard normal directly,
subtracting the mean and dividing by the standard deviation. Alternately, to keep our notation
uniform, one can invent an asset with price $x_i$ such that $-d_2(x_i)$ gives this warped value (using
for $\sigma_1(x)$ the observed historical volatility). But we wish to emphasize that the method we are
describing works with ANY single-variable warping functions, even using a different one for each
variable. The only further substantive ingredient is the plausibility of using JOINTLY normal
distributions, which we now discuss.

The general problem is to find a multivariate probability distribution for the complete set of
variables $(x_1, \ldots, x_n)$, or equivalently for their logarithms. In simple financial models
generalizing the Black-Scholes framework, the multivariate distribution of the logvariables is multivariate
(i.e., jointly) normal; see Musiela and Rutkowski's book "Martingale methods in financial
markets" (1999). This implies that all portfolios of these logvariables are jointly normal, and can
also be used with other logvariables and portfolios of them to form a jointly normal distribution.
Thus, if we wish, it is reasonable to use BARRA (or functionally equivalent) factors as single 
(log) variables in our model, using, say, individual normal distributions for them based on histor- 
ical volatility. These factors may represent fundamentals of companies or even macroeconomic 
variables such as interest rates. We do not further discuss such factors, but refer to the book 
of Grinold and Kahn cited above, which also describes how to closely approximate them as 
portfolios of security returns. Our preference is to not use BARRA factors directly, but stay as 
much as possible in the world of optionable securities, and address questions involving BARRA 
factors in terms of approximating portfolios consisting mostly of optionable securities. (But for 
testing and comparisons, it is still useful to be able to include them directly, and we do have 
that capability.) 

Now we certainly do not wish to use only the simple multidimensional Black-Scholes model, 
which would not directly allow the nonlognormal input from our single-variable distributions 
based on the options markets. At the same time, option prices on individual assets do not tell 
us anything about how assets interact, in particular, their correlations. Fortunately, correla- 
tions may be estimated from past (historical) data, and may be viewed as covariances for data 
that has been standardized (has standard deviation 1). Each multivariate normal distribution 
is determined by its mean and covariance matrix. Thus, a natural approach is to use the in- 
dividual distributions to transform or "warp" the variables to standard normal, then impose a 
multivariate normal structure based on the correlation matrix. This procedure is independent
of the individual warping functions, which may be different for different individual variables, 
and in particular, can incorporate our market-based option distributions for individual variables 
representing securities with active options markets. A slightly different approach is to use cor- 
relations of the warped variables. This procedure is likely to be more accurate, but may involve 
more computational time. 

We indicate some details. As before, it is notationally convenient to use $v_i$ as a second notation
for $x_i$, favoring the latter for fixed values and the former as a variable. Let $C$ be the historical
correlation matrix of the log variables $(\ln v_1, \ldots, \ln v_n)$, whose entries are the cross-correlations

$$\rho_{ij} = \frac{E(\ln v_i \ln v_j) - E(\ln v_i)\, E(\ln v_j)}{\sigma_i \sigma_j},$$

with $\sigma_i$ the standard deviation of $\ln v_i$. Then all diagonal terms $\rho_{ii}$ are equal to 1, and $C$ is a
positive semi-definite covariance matrix, which we may here assume to be nonsingular (positive
definite). If we use instead correlations of warped variables, we have simply

$$\rho_{ij} = E(y_i y_j).$$

Let us define $F_C(y_1, \ldots, y_n)$ as the cdf of a multivariate Gaussian random variable with mean
zero and covariance matrix $C$. Thus, $F_C(b_1, \ldots, b_n)$ is the probability that each variable $y_i$ is
at most some value $b_i$. There are more elaborate versions, such as $F_C(a_1, \ldots, a_n; b_1, \ldots, b_n)$,
giving the probability that each $y_i$ satisfies $a_i \le y_i \le b_i$. In the single-variable case these latter
functions are obtained from the simple cdf by a single subtraction, involving two terms, but the
corresponding bivariate case involves four terms, and in $n$ dimensions there would be $2^n$ terms.
However, each of these more elaborate cdf's can be directly computed as an integral, just like
the simple cdf. Since the more elaborate cdf's are needed for Monte Carlo calculations, possibly
in high dimensions, it is best to think of them as being computed directly.

We then define the multivariate cdf's

$$F(x_1, \ldots, x_n) = F_C(y_1(x_1), \ldots, y_n(x_n)),$$

and

$$F(a_1, \ldots, a_n; b_1, \ldots, b_n) = F_C(y_1(a_1), \ldots, y_n(a_n); y_1(b_1), \ldots, y_n(b_n)),$$

where the $y_i(x_i)$ are the known warping functions for the individual variables. We find it
convenient, with some abuse of language, to speak of $F(x_1, \ldots, x_n)$ as "the cdf", even though we have
all of the above functions in mind, and to use $F(x_1, \ldots, x_n)$ as a proxy for the whole distribution
(which it does, theoretically, determine). This multivariate cdf then has the following properties:

• Since the marginals of $F_C(z_1, \ldots, z_n)$ are Gaussian with mean 0 and variance 1, the
marginals of $F(x_1, \ldots, x_n)$ are equal to $N(y_i(x_i)) = F(x_i)$; i.e., they are correct according to
each single-variable model.

• If the logvariables $(\ln v_1, \ldots, \ln v_n)$ are actually jointly Gaussian, then the multivariate cdf
$F(x_1, \ldots, x_n)$ is correct.

In summary, the true joint distribution is approximated by a jointly lognormal distribution
using historical correlations, combined with warping functions on each variable such that the
marginal distribution of each variable is correct according to a selected single-variable model
(for example, according to our single-variable model for optionable securities, or according to
the lognormal model using historical volatility). The single variables may actually be portfolios,
with a default distribution for the portfolio return being lognormal, based on historical volatility.
This multivariate theory generalizes both our single-variable theory and standard multivariate
(log) Gaussian models. It again allows for market input through option prices, to the extent
that components have an active option market, but does not exclude nonoptionable securities,
and also allows portfolios as single variables. In this way BARRA (or functionally equivalent)
factors are also allowed because of their interpretation as portfolios of long and short positions.

4 Applications to portfolios 

Given the multivariate cdf $F(x_1, \dots, x_n) = F_c(y_1(x_1), \dots, y_n(x_n))$, we can answer many typical
questions. We first give an overview, and then take up some of the applications in more detail.

As one example, suppose that we want to find the cdf of a portfolio variable 

$$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n,$$

where the $h_i$ are arbitrary coefficients. A simple Monte Carlo method, probably not the fastest,
is to draw random samples from the jointly Gaussian distribution with cdf $F_c(y_1, \dots, y_n)$, transform
each $y_i$ via the inverse mapping function $x_i(y_i)$, and then compute the resulting output
sample

$$x = h_1 x_1(y_1) + \cdots + h_n x_n(y_n).$$

After enough samples, we will have an approximation to the cdf of $x$. More precisely, the
probability that $a < x < b$ is, approximately, the fraction of samples $y_1, \dots, y_n$ with
$a < h_1 x_1(y_1) + \cdots + h_n x_n(y_n) < b$, and this approximation becomes exact in the limit for large
sample sizes. This works for real portfolios, or for portfolios constructed from a number of assets
and a residual variable, as might arise from a regression. Usually the regression is done in the
log domain, which we discuss below. Note that the Monte Carlo method just described works
perfectly well if the expression for $x$ above is replaced by any function $f(x_1, \dots, x_n)$ of the $x_i$,
possibly quite nonlinear.
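
The sampling procedure just described might be sketched as follows. This is only an illustration under stated assumptions: the inverse mapping functions are taken to be simple lognormal maps, and the correlation matrix, weights, and function names (`cholesky`, `sample_portfolio`, `empirical_probability`) are hypothetical.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def sample_portfolio(weights, inverse_warps, corr, n_samples, seed=0):
    """Draw samples of x = sum_i h_i * x_i(y_i) with y jointly Gaussian."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    n = len(weights)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the independent normals: y = L z has correlation matrix corr.
        y = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        samples.append(sum(h * warp(yi)
                           for h, warp, yi in zip(weights, inverse_warps, y)))
    return samples

def empirical_probability(samples, a, b):
    """Fraction of samples with a < x < b, approximating Pr(a < x < b)."""
    return sum(1 for x in samples if a < x < b) / len(samples)

# Hypothetical inverse mapping functions x_i(y_i): lognormal marginals.
inverse_warps = [lambda y: 100.0 * math.exp(0.2 * y),
                 lambda y: 50.0 * math.exp(0.3 * y)]
corr = [[1.0, 0.5], [0.5, 1.0]]
samples = sample_portfolio([1.0, 2.0], inverse_warps, corr, n_samples=20000)
p = empirical_probability(samples, 150.0, 250.0)
```

Replacing the weighted sum inside `sample_portfolio` with any other function of the transformed values gives the nonlinear case mentioned above.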

4.1 Log domain portfolios

In this subsection, we point out how our methods fit with another paradigm in common use in 
the financial community, and set up some further notation. It is common to work in the return 
domain, or equivalently, with logarithms; i.e.,

$$\ln x = \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n.$$

Ignoring any possible identification of these variables with those in the previous section, the same
discussion and Monte Carlo method as above applies, if we regard $x$ as a nonlinear portfolio
$x = f(x_1, \dots, x_n) = \exp(\beta_1 \ln x_1 + \cdots + \beta_n \ln x_n)$. If the sum $B$ of the $\beta_i$'s is 1, such an $x$ may be
written $x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n$, where $h_i = \beta_i x / x_i$. Even if $B$ is not 1, incremental changes
("returns") $d\ln x$ computed from this equation for $x$ are consistent with the above expression for
$\ln x$. It is common in the financial community to think of $h_i$ as approximately a constant $h_i$, so
that for short periods, where the $x_i$'s do not change too much, this equation for $x$ is comparable
to the portfolio equation in the previous subsection.^1
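
As a small numerical check of the relation $h_i = \beta_i x / x_i$, the sketch below uses hypothetical weights and asset values (none of the numbers come from the text). It confirms that the linear form reproduces $x$ when the $\beta_i$ sum to 1, and that a small move gives nearly the same change in both forms.

```python
import math

def log_portfolio(betas, xs):
    """x = exp(sum_i beta_i * ln x_i)."""
    return math.exp(sum(b * math.log(x) for b, x in zip(betas, xs)))

def implied_holdings(betas, xs):
    """h_i = beta_i * x / x_i, evaluated at the current asset values."""
    x = log_portfolio(betas, xs)
    return [b * x / xi for b, xi in zip(betas, xs)]

betas = [0.6, 0.4]   # hypothetical weights, summing to 1
xs = [100.0, 50.0]   # hypothetical current asset values

x = log_portfolio(betas, xs)
h = implied_holdings(betas, xs)
linear_value = sum(hi * xi for hi, xi in zip(h, xs))

# A small move in the assets, with the holdings h_i frozen at their old values.
bumped = [101.0, 49.75]
dx_log = log_portfolio(betas, bumped) - x
dx_linear = sum(hi * (xb - xi) for hi, xb, xi in zip(h, bumped, xs))
```

The two changes agree only to first order, which is why the holdings need to be recomputed over longer periods.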

For an asset $x$ not given explicitly in terms of the $x_i$, we obtain a similar expression
via linear regression:

$$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n.$$

The $\beta_i$ for $i \neq 0$ are correlation coefficients chosen to minimize the variance of the residual in
historical data (perhaps subject to constraints, such as $\beta_i > 0$ and $\sum_{i=1}^{n} \beta_i = 1$). For example,
$x$ might be a security without a quoted option, and the $x_i$ for $i \neq 0$ could be taken as assets for
which we individually know the probability distributions, in addition to the required correlation
coefficients for $x$. We have written the residual term as $\beta_0 \ln x_0$ (usually thinking of $\beta_0 = 1$ and
the residual as normally distributed).^2 The mean of the latter could be nonzero, giving the
regression "alpha", a constant term making the mean of the regression correct. Alternatively,
we could modify the equation to allow an explicit alpha, and keep the residual mean zero.
Another minor variation might include the addition of a dummy variable with constant return,
to adjust the value of $x$ up or down. In particular, this gives another way of adjusting the
residual mean to zero. This equation gives the previous one as a special case if we allow $\beta_0 = 0$.

4.1.1 Fast fits of portfolios 

One approach, which promises to be relatively fast computationally, is the following. As in the
development of cdf-implied volatilities in Section 2, let us assume that each log variable $\ln x_i$
above is "Gaussian" with nonconstant variance $\sigma_I(x_i)^2 T$. In other words, the cdf is given by
$F(x_i) = N(-d_2(x_i, \sigma_I(x_i), T))$. Our aim will be to give $F(x)$ by a similar equation, using some
kind of fitted curve $\sigma_I(x)$. We will assume that we have some class of volatility curves in mind,
with a small number of parameters which must be determined.

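If the variables $\ln x_0, \dots, \ln x_n$ were truly jointly Gaussian, then $\ln x$ would also be Gaussian.
Its variance would be given by the formula

$$\mathrm{Var}(\ln x) = \sum_{i,j} \beta_i \sigma_i \rho_{ij} \beta_j \sigma_j T,$$

where $\rho_{ij}$ is the correlation between $\ln x_i$ and $\ln x_j$, and $\sigma_i^2 T = \mathrm{Var}(\ln x_i)$. We therefore define
the estimate $\hat\sigma_I^2(x)$ of $\sigma_I^2(x)$ by the conditional expectation

$$\hat\sigma_I^2(x) = E\Big(\sum_{i,j} \beta_i \sigma_I(x_i) \rho_{ij} \beta_j \sigma_I(x_j) \,\Big|\, \ln x = \sum_i \beta_i \ln x_i\Big).$$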

The calculation of the above conditional expectation may be done with Monte Carlo methods.
In the language of nonlinear portfolios above, we would take the function $f(x_1, \dots, x_n)$ to be
0 outside a thin multidimensional solid enclosing the hyperplane defined by $\ln x = \sum_i \beta_i \ln x_i$.
Inside the solid we would take $f(x_1, \dots, x_n)$ equal to the above expression for $\mathrm{Var}(\ln x)$, divided
by the probability of being in the solid (also a Monte Carlo calculation). In terms of samples, we
just take the average of $\mathrm{Var}(\ln x)$ over all the samples that end up inside the thin solid. However,
it is not necessary to compute all values of $\hat\sigma_I(x)$, but only enough to fit the parameters for the
volatility curves we are using.

^1 Thus $dx/x = \sum_i \beta_i \, d\ln x_i$, so that for small changes $dx_i$ the change $dx$ from the first equation is
approximately the same as would be obtained from the second. However, this relationship requires "rebalancing"
to remain a good approximation for longer periods.

^2 For the residual term $i = 0$, we can use a constant variance, or impose some generic nonconstant structure
based on observed behavior.
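
The "thin solid" average can be sketched as below. This is a rough illustration, not the procedure of the text in full: it assumes two assets, draws samples from a plain lognormal model with historical volatilities (rather than the warped joint distribution), and uses a hypothetical volatility curve `vol_curve(spot, level)`. All parameter names and values are assumptions.

```python
import math
import random

def local_variance(betas, vols, corr):
    """sum_{i,j} beta_i * sigma_i * rho_ij * beta_j * sigma_j."""
    n = len(betas)
    return sum(betas[i] * vols[i] * corr[i][j] * betas[j] * vols[j]
               for i in range(n) for j in range(n))

def conditional_variance(target_log_x, betas, spots, hist_vols, rho, vol_curve,
                         horizon=1.0, half_width=0.02, n_samples=50000, seed=0):
    """Average the local variance over samples landing near ln x = target_log_x."""
    rng = random.Random(seed)
    corr = [[1.0, rho], [rho, 1.0]]
    root_t = math.sqrt(horizon)
    total, kept = 0.0, 0
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        xs = [s * math.exp(v * root_t * z)
              for s, v, z in zip(spots, hist_vols, (z1, z2))]
        log_x = sum(b * math.log(x) for b, x in zip(betas, xs))
        # Keep only samples inside the thin slab around the hyperplane.
        if abs(log_x - target_log_x) < half_width:
            vols = [vol_curve(s, x) for s, x in zip(spots, xs)]
            total += local_variance(betas, vols, corr)
            kept += 1
    return total / kept if kept else float("nan")

betas = [0.5, 0.5]
spots = [100.0, 50.0]
hist_vols = [0.2, 0.2]
center = sum(b * math.log(s) for b, s in zip(betas, spots))

# Hypothetical skewed curve: volatility rises as the asset falls below its spot.
skew = lambda spot, level: 0.2 * math.sqrt(spot / level)
variance_below = conditional_variance(center - 0.2, betas, spots, hist_vols, 0.5, skew)
```

Repeating the call at a handful of target levels would give the few points needed to fit a parametric volatility curve, as the text suggests.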

The estimated mean of $\ln x$ would be $\ln s^* - \tfrac{1}{2}\hat\sigma_I(x)^2 T$, with $s^*$ determined as before, or replaced
with some risk-averse estimate, to obtain the risk-averse or "true" distribution. (It is common,
incidentally, to use factor models such as these to estimate a risk-averse version of
$\ln s^* = \sum_i \beta_i \ln s_i^*$ from risk-averse values of $s_i^*$.)

Also, we mention here one useful variation: We may prefer not to view the residual term
$\ln x_0$ as part of the model, and instead write down a joint pdf only for $\ln x_1, \dots, \ln x_n$. In this
case we can use the double expectation

$$\hat\sigma_I^2(x) = E\Big(E\Big(\sum_{i,j} \beta_i \sigma_I(x_i) \rho_{ij} \beta_j \sigma_I(x_j) \,\Big|\, \ln x = \sum_i \beta_i \ln x_i\Big)\Big),$$

where the inner expectation is with respect to the variables $x_1, x_2, \dots, x_n$ and the outer expectation
is with respect to the residual. We might take the standard deviation $\sigma(x_0)$ of the
residual (taking $\beta_0 = 1$) as a constant, determined historically, or make an estimate based on
some leverage model.

Now we can estimate the cdf $F(x)$ by

$$F(x) = N(-d_2(x, \hat\sigma_I(x), T))$$

as in the univariate case. To summarize, we use our multivariate model to determine parameters
for a univariate model of the portfolio. After that is done, we can obtain probabilities for the
portfolio without having to go back to the multivariate model, thus achieving a savings in
time. We could take this one step further and think of randomly generating values of $\hat\sigma_I(x, T)$
independently of any Monte Carlo philosophy (but perhaps still throwing away values of $x$ too
far out-of-the-money), and then using the values obtained to do the regression required in the
Dumas-Fleming-Whaley approach.
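
Once a volatility curve has been fitted, evaluating the univariate cdf is inexpensive. The sketch below assumes the usual Black-Scholes-style form $d_2 = (\ln(s^*/x) - \sigma^2 T/2)/(\sigma\sqrt{T})$; the fitted curve `fitted_vol` and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def d2(forward, level, vol, horizon):
    """Assumed form: (ln(forward/level) - vol^2 T / 2) / (vol sqrt(T))."""
    return ((math.log(forward / level) - 0.5 * vol * vol * horizon)
            / (vol * math.sqrt(horizon)))

def portfolio_cdf(level, forward, vol_curve, horizon):
    """F(x) = N(-d2(x, sigma(x), T)) for a fitted volatility curve sigma."""
    return STD_NORMAL.cdf(-d2(forward, level, vol_curve(level), horizon))

# Hypothetical fitted curve: volatility rises mildly as the level falls.
def fitted_vol(level, base=0.2, reference=100.0, slope=0.1):
    return base * (1.0 + slope * math.log(reference / level))

prob_below_90 = portfolio_cdf(90.0, 100.0, fitted_vol, 1.0)
```

With a flat curve this reduces to the ordinary lognormal cdf, whose median is $s^* e^{-\sigma^2 T/2}$.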

5 "What-if" questions

The multivariate distribution lends itself to the study of many questions regarding conditional 
probabilities. For example, suppose that we want to know the effect of the increase or decrease of 
some segment of the market on a portfolio, or the increase or decrease of some macro-economic
factor. BARRA, following earlier ideas of Ross, has viewed such macro-economic factors as 
portfolios with both long and short positions. Similarly, BARRA considers market segments 
associated to price-to-earnings ratios and other fundamental parameters, as well as to industry 
groupings, as portfolios. (See the book of Grinold-Kahn cited above.) Thus, we are led simply 
to consider the effect of one portfolio on another. 

For definiteness, let us suppose the first portfolio is $x$, where as above

$$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n,$$

and the second portfolio is $y$, where

$$\ln y = \gamma_0 \ln y_0 + \gamma_1 \ln x_1 + \cdots + \gamma_n \ln x_n.$$

We take $\beta_0 = \gamma_0 = 1$, and view $\ln x_0$ and $\varepsilon = \ln y_0$ as residuals with mean 0. The latter residual
is not assumed to be a factor in our multivariate model. Consider the following typical "what-if"
question: Let $A$ and $B$ be given positive constants. If we know $x > A$ at time $T$, what is the
probability that $y > B$ at time $T$? We give two approaches to this problem, the first probably
quicker, but possibly not as accurate, using a regression to avoid at least some Monte Carlo
calculations.

5.1 "What-if": An approach involving part regression, part Monte Carlo

We have $\ln y > \ln B$ iff $\ln y - \varepsilon > \ln B - \varepsilon$. All correlations $\rho_{ij}$ between $\ln x_i$ and $\ln x_j$ are assumed
known. We may also assume that we have historical values of volatilities $\sigma_i = \sqrt{\mathrm{Var}(\ln x_i)}$.
(Alternatively, we could estimate such values as expected values of implied volatilities, but it
would not be difficult to maintain an inventory of historical values, and more in the spirit of
this part of the calculation to do so.) Thus we can estimate the historical covariances between
$\ln x$ and $\ln y - \varepsilon$:

$$\mathrm{Cov}(\ln x, \ln y - \varepsilon) = \sum_{i,j} \beta_i \sigma_i \rho_{ij} \gamma_j \sigma_j T,$$

as well as $\sigma_{\ln x} = \sqrt{\mathrm{Var}(\ln x)}$, $\sigma_{\ln y - \varepsilon} = \sqrt{\mathrm{Var}(\ln y - \varepsilon)}$, and the correlation

$$\rho_{\ln x,\, \ln y - \varepsilon} = \frac{\mathrm{Cov}(\ln x, \ln y - \varepsilon)}{\sigma_{\ln x}\, \sigma_{\ln y - \varepsilon}}.$$

This gives a standard regression for the variable $\ln y - \varepsilon$ expressed in standard deviations from
its mean, in terms of a similarly standardized expression for $\ln x$. Note that $\varepsilon$ has mean 0 by
construction. Put $d_2(s^*, x, \sigma) = \dfrac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}}$. Thus $-d_2(s_x^*, x, \sigma)$ measures standardized $\ln x$
using historical volatility $\sigma$, and $-d_2(x) = -d_2(s_x^*, x, \sigma_I(x))$ measures "standardized" (warped) $\ln x$
using the cdf-implied volatility curve $\sigma_I(x)$, as discussed in the previous section. Here $s_x^*$ denotes
our best estimate for the value of $x$ at time $T$.

Let $\sigma_I(y, \varepsilon)$ denote the volatility curve associated with $\ln y - \varepsilon$, which may be estimated as
in the previous section (or computed from estimates of $\sigma_I(y)$ and the standard deviation of
the residual, if we are willing to view the residual as uncorrelated with $\ln y - \varepsilon$, as is guaranteed
in unconstrained regression). Put $d_2(y, \varepsilon) = d_2(s_y^*, y e^{-\varepsilon}, \sigma_I(y, \varepsilon))$, so that $-d_2(y, \varepsilon)$ is a
"standardized" measure of $\ln y - \varepsilon$. Then the standard regression appropriate to our model is

$$-d_2(y, \varepsilon) = \rho(\varepsilon)\,(-d_2(x)).$$

There is a residual associated with this regression, which we have not written down. It is pre- 
sumably normal, and its variance may be computed. For notational reasons we will just imagine 
it has been incorporated into the original e. As is apparent from the form of the expressions 
in the display, an alternative to the above regression is to do it with the warped correlation 
coefficients suggested in the previous section. If, in addition, it was appropriate to view the 
original portfolios as linear combinations of warped variables (our standard normal marginals), 
the regression above could be done without any recourse to Monte Carlo calculations. Simi- 
lar remarks would apply if we used constant historical volatility functions throughout, though 
presumably the latter procedure would lose accuracy. 

In any case, we can now answer our "what-if" question as a simple expectation in the univariate
normal distribution of the (adjusted) residual $\varepsilon$. Abbreviate $d_2(s_x^*, A, \sigma_I(A))$ to $d_2(A)$ and
$d_2(s_y^*, B e^{-\varepsilon}, \sigma_I(B, \varepsilon))$ to $d_2(B, \varepsilon)$. Assume $\rho(\varepsilon) > 0$ (the natural case of a positive correlation).
Then we have

$$\begin{aligned}
\Pr\{y > B \mid x > A\} &= E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A))\big) \\
&= E\big(\Pr(-d_2(x) > \rho(\varepsilon)^{-1}(-d_2(B, \varepsilon)) \mid -d_2(x) > -d_2(A))\big) \\
&= E\big(\min\{1,\ N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))\}\big).
\end{aligned}$$

The first equation follows just because $-d_2(y, \varepsilon)$ is monotonically increasing as a function of $y$;
that is, the condition that $y > B$ is completely equivalent to the condition $-d_2(y, \varepsilon) > -d_2(B, \varepsilon)$.
Similar remarks hold for the condition $x > A$, while the expression $\Pr\{y > B \mid x > A\}$ just means
the probability that the condition $y > B$ holds when it is known that $x > A$. The second equation
is then derived with the displayed expression above for $-d_2(y, \varepsilon)$. (If $\rho(\varepsilon)$ is negative, the
inequality involving its inverse reverses.) This inner expectation is then calculated in the normal
distribution. For values of $\varepsilon$ for which $-d_2(A)$ is at least as large as $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$, the expectation
is a certainty, and yields the value 1. When $-d_2(A)$ is smaller than $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$,
its cumulative normal distribution value $N(-d_2(A))$ is smaller than $N(\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon)))$, and
the probability $1 - N(-d_2(A)) = N(d_2(A))$ that the standard normal variable $z = -d_2(x)$
is at least $-d_2(A)$ is larger than the corresponding probability
$1 - N(\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))) = N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon))$ that $z$ be at least $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$.
The ratio $N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))$,
which is the desired inner expectation, is thus smaller than 1, as is appropriate for a probability,
conditional or not. If $\rho(\varepsilon)$ is negative, similar reasoning leads instead to the expression
$E\big(\max\{0,\ (N(d_2(A)) - N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon))) / N(d_2(A))\}\big)$ for the desired conditional probability.
Although the final answer in either case is an expectation (over $\varepsilon$), it is essentially an integral
that could be computed quickly with power series. (A very simple and accurate power-series
expansion of $N(z)$ is given on p. 252 of the book by Hull cited above.) Using that, one could determine
by iterative methods what value of $\varepsilon$ makes, say, the ratio $N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))$
equal to 1, and then integrate the ratio against the standard normal pdf from $-\infty$ to the determined
value of $\varepsilon$, in the $\rho(\varepsilon) > 0$ case. Similar remarks apply if $\rho(\varepsilon) < 0$. (Note that, if $\rho(\varepsilon) = 0$,
the variables $\ln x$ and $\ln y$ are uncorrelated, and the conditional probability $\Pr\{y > B \mid x > A\}$
is the same as the unconditional probability $\Pr\{y > B\}$.)

All of the latter calculations can be done very fast. Of course, we have already used some 
Monte Carlo calculations to get this far, unless we are in the simplified context of constant 
volatility functions. 
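
The closed-form inner probability and its average over the residual might be sketched as follows. This is an illustration under simplifying assumptions: constant volatilities stand in for the implied volatility curves, $\rho$ is treated as a constant, $d_2$ is taken in the usual Black-Scholes-style form, and the outer expectation is done by simple sampling of a normal residual rather than by power series. All names and numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist

N = NormalDist().cdf

def d2(forward, level, vol, horizon):
    """Assumed form: (ln(forward/level) - vol^2 T / 2) / (vol sqrt(T))."""
    return ((math.log(forward / level) - 0.5 * vol * vol * horizon)
            / (vol * math.sqrt(horizon)))

def inner_probability(d2_a, d2_b, rho):
    """Pr(y > B | x > A) for one fixed value of the residual."""
    if rho == 0.0:
        raise ValueError("rho must be nonzero in this sketch")
    tail = N(d2_b / rho)  # N(rho^{-1} d2(B, eps))
    if rho > 0.0:
        return min(1.0, tail / N(d2_a))
    return max(0.0, (N(d2_a) - tail) / N(d2_a))

def what_if_probability(forward_x, level_a, vol_x, forward_y, level_b, vol_y,
                        rho, resid_std, horizon, n_draws=20000, seed=0):
    """Average the inner probability over draws of a mean-zero normal residual."""
    rng = random.Random(seed)
    d2_a = d2(forward_x, level_a, vol_x, horizon)
    total = 0.0
    for _ in range(n_draws):
        eps = rng.gauss(0.0, resid_std)
        d2_b = d2(forward_y, level_b * math.exp(-eps), vol_y, horizon)
        total += inner_probability(d2_a, d2_b, rho)
    return total / n_draws

p = what_if_probability(100.0, 105.0, 0.2, 50.0, 52.0, 0.25,
                        rho=0.6, resid_std=0.05, horizon=1.0)
```

Because the integrand is bounded and smooth apart from the kink where the ratio reaches 1, a deterministic quadrature over the residual would serve equally well here.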

5.2 "What-if": The full Monte Carlo

It is easy to say how we would compute an answer to the same "what-if" question, using our
full joint probability distribution. We simply write

$$\Pr\{y > B \mid x > A\} = E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A))\big)$$

and interpret $\ln x$ in $-d_2(x)$, and $\ln y - \varepsilon$ in $-d_2(y, \varepsilon)$, in terms of their expansions in $\ln x_0$,
$\ln x_1, \dots, \ln x_n$. To compute, say, the inner expectation by a Monte Carlo calculation, we would
generate a large number of random samples of multivariate standard normal vectors $z$ with
covariance matrix $C$. We then take the average, over the samples $z$ which happen to satisfy
$-d_2(x) > -d_2(A)$, of the function which is 1 when $-d_2(y, \varepsilon) > -d_2(B, \varepsilon)$ and 0 otherwise. We have not
experimented to see whether this method yields better answers than the regression procedure
above. Nevertheless, it illustrates how we could approach more sophisticated "what-if" questions
that could not be easily treated by regressions. For example, suppose we believe that factor $w$
will remain in a range $C < w < D$, and ask the same question about $y$, subject to the same
condition on $x$. This is hard to formulate in terms of regression, and is simply not possible in
terms of single-factor regression. However, it is easy to answer with the full distribution:

$$\Pr\{y > B \mid x > A,\ C < w < D\} = E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A),\ -d_2(C) < -d_2(w) < -d_2(D))\big).$$

Finally, we may not want to work in the log domain, which, if we started with a fixed portfolio
$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n$, would force us into an approximation, as noted. But, working with
the full distribution, we can phrase a condition $x > A$ as $h_1 x_1(y_1) + \cdots + h_n x_n(y_n) > A$ in the
language of the first section, where the vector of $y_i$'s plays the role of our vector $z$ here. Monte
Carlo calculations can now proceed as before, using log domain expressions or not for the other
conditions.
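
A direct simulation of the "what-if" question might look like the sketch below. It is a simplified stand-in for the full method: two lognormal factors with constant volatilities replace the warped joint distribution, both portfolios carry independent normal residuals, and every parameter value and name is a hypothetical choice.

```python
import math
import random

def simulate_what_if(betas, gammas, spots, vols, rho, level_a, level_b,
                     resid_std_x=0.05, resid_std_y=0.05, horizon=1.0,
                     n_samples=50000, seed=0):
    """Return (Pr(y > B | x > A), Pr(y > B)) estimated from joint samples."""
    rng = random.Random(seed)
    root_t = math.sqrt(horizon)
    hits_a = hits_b = hits_ab = 0
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        logs = [math.log(s) + v * root_t * z
                for s, v, z in zip(spots, vols, (z1, z2))]
        # ln x and ln y share the factors but have separate residuals.
        log_x = sum(b * l for b, l in zip(betas, logs)) + rng.gauss(0.0, resid_std_x)
        log_y = sum(g * l for g, l in zip(gammas, logs)) + rng.gauss(0.0, resid_std_y)
        above_a = log_x > math.log(level_a)
        above_b = log_y > math.log(level_b)
        hits_a += above_a
        hits_b += above_b
        hits_ab += above_a and above_b
    conditional = hits_ab / hits_a if hits_a else float("nan")
    return conditional, hits_b / n_samples

conditional, unconditional = simulate_what_if(
    betas=[0.6, 0.4], gammas=[0.5, 0.5], spots=[100.0, 50.0],
    vols=[0.2, 0.3], rho=0.5, level_a=80.0, level_b=75.0)
```

Adding a further condition such as $C < w < D$ only requires one more test inside the loop before a sample is counted, which is the flexibility the text points to.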

6 "You gotta believe" questions 

In the previous section we were focusing on an investor thinking about the value of his or her
portfolio $y$ in response to the change in a factor $x$. Conversely, an investor might want to know
what the investment world looks like if a given stock or index $y$ goes to a certain level $B$ at time
$T$. What is the expected value $A$ at time $T$ of another portfolio $x$, or simply of one of the factors
$x_i$? Our main plan is, upon input by the user that $y$ is going to level $B$, to list several assets $x_i$
or factors/indices $x$ most highly correlated with $y$ and their expected values with $y$ at $B$.

It would also be possible to display a confidence interval for each selected asset or factor, 
and have other information about its new projected probability distribution readily available. 
We could also offer comparisons with the old projected probability distribution of $x$, where no
assumption on $y$ is made. Finally, in some cases, where it was possible to explain much of
the variance of $y$ with just a few $x_i$ (appearing in the regression of $y$), we could list percentage
increases/decreases of a portfolio of these $x_i$ required to make $B$ the expected value of $y$, based
solely on its dependence on this portfolio. (For example, the coefficients in the portfolio could
come from the regression of $y$ with respect to all the $x_i$, or some new regression might be done,
perhaps allowing user-defined constraints). It should be mentioned that medians or modes 
are alternatives to expected values (means) here and above; in any case users will need to be 
educated about the fact that the median and mode differ systematically from the mean in near 
lognormal distributions. 

The main problem might be viewed as understanding the probability distribution of x, given 
that $y > B$ at a given time $T$, with $x$ and $y$ as in the previous section. This can be approached
by the methods of the previous sections, by reversing the roles of the variables. 

There is, however, a simpler question that can be treated in an especially quick way. Consider
the problem of determining the mean of $x$ conditioned on the equality $y = B$ at time $T$. The
idea is to use simple regression methods, but interpret answers as measured in terms of our
variable volatilities. In our previous notation, we have a regression

$$-d_2(x) = \rho \cdot (-d_2(y, \varepsilon)) + \nu,$$

where $\rho$ (which we called $\rho(\varepsilon)$ in the previous section) is the historically determined correlation
between $\ln x$ and the random variable $\ln y - \varepsilon$. Note that the roles of dependent and independent
variable are reversed. There is also a residual $\nu$ which has mean 0 here, and plays no role (gets
averaged away). Thus, the desired conditional expected value $A$ of $x$ is obtained from

$$-d_2(A) = E(-d_2(x) \mid y = B) = \rho \cdot E\big(-d_2(s_y^*, B e^{-\varepsilon}, \sigma_I(y, \varepsilon))\big) = \rho \cdot \big(-d_2(s_y^*, B, \sigma_I(y, \varepsilon))\big).$$

Recall that $\sigma_I(y, \varepsilon)$ is an estimate, obtained by Monte Carlo methods, of the implied volatility
$\sigma_I$ associated to the random variable $\ln y - \varepsilon$. For faster but less accurate calculations it can be
estimated historically as $\sqrt{\sum_{i,j} \gamma_i \sigma_i \rho_{ij} \gamma_j \sigma_j}$, with all of the $\sigma$'s, $\gamma$'s, and $\rho$'s here given historically.
(See the previous section for notation.) Similarly, for fast calculations, $-d_2(x)$ could use
historical volatility, though we expect it to be given more accurately, or rather, more accurately
according to the market view, as $-d_2(x) = -d_2(s_x^*, x, \sigma_I(x))$, using the implied volatility function
estimate $\sigma_I(x)$. If $x = x_i$ is a single asset or index in our model, then $\sigma_I(x) = \sigma_I(x_i)$ does
not require a Monte Carlo estimate, but is presumably already available.

To summarize, the conditional expected values required to answer "you gotta believe" questions 
are easily obtained by regression methods. The accuracy of such answers is enhanced, or at 
least shaped more to reflect market input, when all logvariables are measured in "standard 
deviations," interpreted as our variable volatilities. 
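
Solving the regression relation for the level $A$ can be sketched as below. The sketch assumes constant volatilities for both portfolios (with a variable curve, $\sigma_I(A)$ depends on $A$ and the equation would need to be solved iteratively) and the usual Black-Scholes-style form for $d_2$; the function names and inputs are hypothetical.

```python
import math

def d2(forward, level, vol, horizon):
    """Assumed form: (ln(forward/level) - vol^2 T / 2) / (vol sqrt(T))."""
    return ((math.log(forward / level) - 0.5 * vol * vol * horizon)
            / (vol * math.sqrt(horizon)))

def conditional_level(forward_x, vol_x, forward_y, vol_y, level_b, rho, horizon):
    """Solve -d2(A) = rho * (-d2(s_y*, B, sigma_y)) for A with constant vols."""
    target = rho * (-d2(forward_y, level_b, vol_y, horizon))
    # -d2(A) = (ln A - ln s_x* + vol_x^2 T / 2) / (vol_x sqrt(T)) = target
    log_a = (math.log(forward_x) - 0.5 * vol_x * vol_x * horizon
             + vol_x * math.sqrt(horizon) * target)
    return math.exp(log_a)

# Hypothetical inputs: y is assumed to reach 60 against a forward of 50.
level_a = conditional_level(forward_x=100.0, vol_x=0.2,
                            forward_y=50.0, vol_y=0.25,
                            level_b=60.0, rho=0.6, horizon=1.0)
```

With $\rho = 0$ the result falls back to the unconditional central level of $x$, and with $\rho = 1$ and identical parameters it returns $B$ itself, which are useful sanity checks on any implementation.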

7 Portfolios containing option securities 

We conclude this document by briefly pointing out that our methods, when using full Monte
Carlo calculations, easily apply to portfolios containing option securities. The well-known idea
is to think of an option as a kind of nonlinear portfolio (a quadratic one, to be more
precise). Thus, an option on a single underlying security with underlying price $x_1$ has a price
approximately $x = c + \Delta(x_1 - s_1) + (1/2)\Gamma(x_1 - s_1)^2$ for $x_1$ near $s_1$, where the option was evaluated
to a known value $c$. Here $\Delta$ and $\Gamma$ are well-known parameters in the options markets, giving
the first and second derivatives of the option price at $s_1$ with respect to the underlying security
price $x_1$. Perhaps the most characteristic feature of options is that they have nonzero $\Gamma$: their
proportion of increase or decrease with respect to the underlying security price changes as the
security price changes. Explicit formulas in terms of other standard parameters are available,
say, in the Black-Scholes theory for both $\Delta$ and $\Gamma$ (see the Hull book cited above). Such formulas
could be obtained by differentiation directly in other theories or when using empirically-fitted
curves. In any case, once we have such an explicit approximation to $x$, its probability distribution
is easily given by the Monte Carlo methods of Subsection 3.1 above. The same method applies
as well to portfolios containing several options and other securities.
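
The quadratic approximation can be checked against an exact price in a short sketch. It uses the standard Black-Scholes formulas for a European call and its $\Delta$ and $\Gamma$; the strike, rate, volatility, and expiry below are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist

_N = NormalDist()

def black_scholes_call(spot, strike, rate, vol, expiry):
    """Return (price, delta, gamma) of a European call under Black-Scholes."""
    root_t = math.sqrt(expiry)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * expiry) / (vol * root_t)
    d2 = d1 - vol * root_t
    price = spot * _N.cdf(d1) - strike * math.exp(-rate * expiry) * _N.cdf(d2)
    delta = _N.cdf(d1)
    gamma = _N.pdf(d1) / (spot * vol * root_t)
    return price, delta, gamma

def delta_gamma_value(x1, s1, c, delta, gamma):
    """Quadratic approximation c + delta (x1 - s1) + (1/2) gamma (x1 - s1)^2."""
    return c + delta * (x1 - s1) + 0.5 * gamma * (x1 - s1) ** 2

s1 = 100.0
c, delta, gamma = black_scholes_call(s1, strike=100.0, rate=0.05, vol=0.2, expiry=1.0)

# Compare the approximation with the exact price after a small move to 102.
approx = delta_gamma_value(102.0, s1, c, delta, gamma)
exact, _, _ = black_scholes_call(102.0, strike=100.0, rate=0.05, vol=0.2, expiry=1.0)
```

In a Monte Carlo run, `delta_gamma_value` would simply be applied to each sampled underlying price, so options enter the portfolio as one more nonlinear function of the sampled variables.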